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Abstract 

This report describes a 'learning program' that acquires much of the 
knowledge required by a parsing system that processes conversations in a 'natural' 
language akin to ham-radio jargon. The learning program derives Information from 
example sentences taken from transcripts of actual conversations, and uses this 
knowledge to extend the 'core' augmented transition network (ATN) grammar. The 
parser can use the extended grammar to process the example sentences, plus a 
large numt)er of syntactically and semantically related oerrtences. 

The learning program uses a set of heurettes to detwmlne the difference 
between the existing version of the grammar and a superset that could process the 
example sentence. A set of models act as temples to produce possible extensions 
to the grammar. An evaluation measure selects one of the extensions and adds It to 
the grammar. This extension is henceforth an integral component of the knowledge 
b^e and may be used by the parser to process conversations and by the learning 
program to extend the grammar further. 

This report relates the mechanisms used by tf>e learnir^g program to 
grammatical inference of context-sensitive languages, which include the natural 
languages, and some proposed linguistic models of human language acquisition. 
These models describe language acquisition as a process of developing hypotheses 
according to the constraints of innate universal rules, and acceptance of those 
hypotheses that make it possible for the chHd to understand new sentences. 
Similarly, the learning program devetops Its hypotheses within the constraints of 
certain 'universal' models and accepts only those hypotheses that enable the parser 
to process the motivating example. 
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1. Introduction 

1.1 Motivation 

As computer technology advances, computers are being applied to more 
complex tasks that require increasingly greater 'domain-specific' knowledgfe. One of 
the pressing goals of computer science aiKl engineering Is to deterrhirw how to 
incorporate this knowledge into computer systems In arrielfkHent way. 

There are two major appro£K;hes In current uasi ttial atfleiT^t to solve this 
problem. One approach In current use is the devekH|fW«t jrf various 'tools' 
specif kjally tailored for installing the domaln-speciflj knpMfledge, Including 
very-high-level languages and special-purpose editors. Another approach, which 
has met with consWerably less success, Is to let the compter de mo^ of the work ijf 
acquiring the informatkMi. This report describes a compu^ program^ that acquires 
rroich of the knowledge necessary to perform its task. 

The task in this case fe parsing human conversaticM^ in a very iiinlted domain. 
The conversations take place between operators on Mor^ cbcte radfo-networks In a 
simple 'natural' language akin to ham-radfo jargon, whei# the possHale toptes of 
clonversation are limited by radio network protocol to siK*» thin^ as efii^>Uelitog 
contact, discussing and sending messages, re-sefKlIng garbled parts of the 
messages, and ending contact. In tandem with a transcription system, the parser 
processes the hand-sent Morse code to produce a human-readable transcript and 
Information summary. The domain-speciffe knowledge required by the parser 
consists of the discourse structure and the syntax and semantics of the language, 
and this knowledge is organized as an augmented transition network (ATN). 
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However, the programnFier who developed the original parser was not able to 
incorporate enough domain-specific knowledge into the system to parse all, or even 
most, of the actual conversations that ocoir in this domain, simply because tt^ls 
information is not avfrilable m its totality. Howesror, one^can expect that as the parser 
performs its tasl<, transcripts of convsersartionslhat ttcan not process adequately with 
its current knowledge base will become av»table. tt was d^ira4>te to develop a 
mechanism by which the system could extend its knowledge base, given the new 
transcripts, in a way that enables it to corriectly proaess each of the new 
transmteslons (or sentwices) in these example eon^^rsatons, plus a large number of 
similar transmissions. 

A computer program with ttTese E^iliti^ wo^d incoiporate a high degree of 
learning ability. Winston [23] describes the le\®ls erf learning ability as a shift of 
effort from the teacher to the student. His four levels include learning by being 
programmed, learning by being told, learning by example, and learning by discovery. 
The original domain-specific knowledge incorporated i^ the programmer into the 
Syst^n d^cribed in this report is an ©j^unple of Meaning by being programmed'. A 
system that was explicitly guided by some teacher In Its acquisition of knowledge, 
with the instructions of th© teadier phrased in ^t\e language of the domain rather 
than some programming language, would be 'learning by being told'. The program 
described here at times must learn by being toW, for the progrsBT* must sometimes 
ask qi^stions of the human supplying examples mid ttie human responds in the 
language of the domain. However, for the most part ttiis program 'learns by 
example': tfie program derives the ability to parse new sentences and phrases from 
the examples of sentences and phrsees pr^eoted to It. 
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One approach to d€»«^opaig a computer pfogram that could acQutre such 
knowledge, or •teani\ from«xan9>tes is tol^mwr from^tfie^^ the leaminQ 

processes of humans, the m@st success^ ^learninfi fnacMnes' to diie. Hot^^ 
hum^ leaning processes are incompletsiy uhderstoodt ll^irrent ttieories si^{^ 
that they consi^ in part of forming §eiiereM2atioiisfrOftr<latdand»deriiAng rules Hrom 
them. The correct i^ipHcation of ttiese mlesi^ tbe leamer demonstrates tfurt 
something h^, indeed, t)eeniaaFned. 

One well-t^iown example of huroaiileaTilKigttiat seems^ on ^ surfape, very 
similar to the problem at hand is the accpii^tioR^of lan^ui^e^ cNtdren. Humans 
acquire their first language almost entirely by hearmg it spoken. Th#gewersteaMon 
(^ dataf<^oiMS very quk^ as d)ik:lren leanitQ pfoduce.granwnatical^ef^ei^^es with 
no fonnal instructkHi in the gramn^r of ^leir naHveJaii^ilige; ttiay Infer #e rules pf 
tie^ gretfnmarfr(»n tie Sentences ttiey heeu' ^xjkeiiil^i 

Some linguistic models propoeed by Qiomsky [4, 5} make ttie coiMiOversM 
proposition VtvaA a cNk) may know about oei^iin aspects of language; some 
knowledge is inn^to ^k! me cf^id need nofcleam these aspects in Vne ususd sense. 
These Ornate aspects of tenguage are eaMed1h&ut^immi^m9mmar and, ao&mtmQ 
to these models, form the basis for forming genei^sflAlQns and derlvvis rules from 
the utterances tftat tt)e ct»ld fleers spoken. 

The syslem ctescribed tiere borrows some aapeds dl #)ese linguistic mo^eto 
«)at seem par^cularty appropriale tor extensioii of ttie gf»fimar feed 1^ the parsing 
program, and incorporaies them in a separate leaoMfi^ program tfiatlricHidesalilBie 
domain kiformation of the on^Mtf parser ami can operale on tie same grammor. 
This does not mean thsA the resuSQng computer program models human language 
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acquisition in any psychologically reailstic s^Tse. However, the research described 
here demonstrates that theories that attempt to explain human learning processes 
are also useful for developing compute progr»W8 tial acquire knowledge. 

Previous work in this area has concentrated on the envelopment of algorithms 
for the inference of formal grammars from very large sete of eitamples. The problem 
of Infen-ing an exact grammar for an arb^ary (but constrained) language has been 
solved for the regular languages [3, 12, 14j, and some very restricted subsets erf the 
context-free languages [6, 7, 8, 17, 22], However, there has been very little progress 
toward the development of a general and praGfe;alii«©Gh€B»sra for diarlving 
grammars for the more powerful context-sensitive language, whidn include all 
natural languages. TWs research represents a «ep^^t©«^wd*hl» gold. 

1.2 Organization 

The result of this research is a learning program called MAGE (Morse 
Automatic Grammar Extension system). MAGE uses a 'domain model' that includes 
information about the simple language and the environment in which it is used, a 
small 'core' grammar organized as an ATN, and some knowledge about what type of 
result it is expected to produce. MAGE is designed to receive individual examples of 
sentences from the language and extend the grammar so that it can parse each 
example, plus a large number of similar sentences. An arbitrary number of examples 
may be provided to produce an arbitrarily large grammar. 

MAGE uses a set of heuristics to determine the difference between the 
grammar and a superset of the grammar that would be able to process the example 
sentence. It uses a set of models as templates to 'enumerate', or list, a set of possible 
extensions to the grammar that might bridge this difference. A unique 'evaluation 
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measure' guides the enumeration process, to ks&p ttie M of possible extensions 
workably short, and selects one of these extensions, «*»fch is then added to ttie 
grammar. The evaluation measure is based on tt» abil^ of the grammar to extract 
anportant Information Irom conversations: an ©xtenslefi is enumerated onty if It 
provides a mechaniaTi for parsing the new pfvase, without considering the context, 
and an extension is selected only if it n^rftes it poa^bie for the entire example 
containing the new phrase to be parsed by the standard ATN parswig algorithm that 
te used ^ a tester. 

The process outlined above is anaiogoMS, in some aspecte, to Hngoistic 
models developed by Chomsky [4, 51 and Date {9} of ^ te»i^ing mechaiisms used 
by children when acquiring a native language. According to these modeisi the child 
has Innate knowledge of a universal grammar that provktes a mold In which the chlW 
develops the grammar for her own language; and the chiW uses a set of universal 
rules that prescribe the ways she can organize the utterances she hears and 
evaluate the hypotheses she forms according to whether or not they help her to 
understand the utterance. These components of the language acquisition models 
are similar to the domain model, hypothesis-formation models, and evaluation 
measure of MAGE, respectively. 

Although MAGE borrows from linguistte models, this author does not 
necessarily endorse any of these models nor support these or any other lingulstfc 
theories. The augmented transition network mechanism discussed In this report is 
not related to these linguistic models, nor does this author claim that the ATN is a 
realistte model of human language comprehension. What this report does say about 
these ttieories of language acquisition Is that some aspects of the models can be 
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implemented as a computer program operating m a data structure representing an 
ATN grammar. 

The rest of this report is organized as follows: 

• Chapter 2 presents f^AGE's domain model and the particular aspects 
that make possible the evaluation measure. 

• Chapter 3 states the general grammatical inference problem, and 
presents the hypothesis-formation algorithm and evaluation measure 
used by MAGE in its p^ial solution to the related problem of 
grammatical extension. 

• Chapter 4 discusses further the domain model, hyptothesis-formation 
models, and evaluation measure in the cdhtext of language acquisition 
by children. 

• Chapter 5 describes the design and impten^tation of MAGE. 

• Chapter 6 contains a summary and cmidt^ons. 
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2. An ATN with Semantic Gatiegofies 

2.1 Machine Recognition of Hand-sent Morse Code 

The research was motivated by the re^l-wqild problem of automating the 
recognition and understanding of hand-sent Morse code In an amateur-radio 
network environment. Morse code const^ of ^Ne ©tempts: dots, dashes, mari^ 
spac^, lett^ spaces, aavi word spiK^es. The Er^l^ aN3h€^)et, digtts« and 
punctuation are encoded as groups of one to six marks (dots or da^ies) separated 
by mark spaces. These ^oups are s^jaraled from eacii other by letter spaces 
(ideally, three times as long as a mark space) arKl combined Into words, vnrWch are 
separated from eadi other by word spaces (ld«^ sev^^Bnfes as feng €« a mark 
space). For example, "SOS" istransmitledaiP^diJtms^JotmsLdotlsdash msdash 
ms dash Is dot ms dot ms dot ws", where "S" is encoded as "...", "O" as " -", "ms" 
means mark space, "Is" letter space, and "ws" word space. Morse code Is 
transmitted over radio by ^ort signals (dots) and k>ng ^nals (dashes), witfi the 
pauses in between signals serving as spaxxB. 

It is desirable to aut(Knate Vhe rec^ion of tt)ese signals and the transcription 
of the marks and spaces back into charact^ text, to iM'Oduce a ridable <xjtput 
However, th^e are many aspects of msyiuai Morse code that msd<e transoiptlon 
difficult, not only for a machine but also for a hum^ (^p&rai&r. Many eirors are 
introduced by radio attributes IRte transmitter chirp said atmospf^anc interference, 
and by send^ in-egularities includir^ ^>actng ^rors (e.g. a letter space that Is 
shorter ttian a nearby maik ^3ace), m&ik &j&rs (e.g. sending a da^ instead of tMK) 
dots) and spelling errors. Tlie result is emalogous to speech that is slurred or brd<en 
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by arbitrary pauses and includes a few mi^)ronounC6clw(E»^d8, 

Research in machine transcription of manual Morse ecxie began In the 1950's 
and included the development of MAIKDE (Mofse AUtomi^ DEcoder) [11] at M.l.T.'s 
Lincoln Laboratory. MAUDE and other early transcribers were based on a small set 
of statistical and linguistic rules; no attempt was made to take advantage of the 
constraints provided by radio network protocol or the iriformational content of the 
transmissions. 

Recently, a system called COMCO-1 (Computerized Morse Code Operator) 
[21] has been devetoped at M.l.T.'s Laboratory for Computer Science. It Involves a 
new perspective on the manual Morse code problem: it utiMzes extensive knowledge 
of the peculiarities of hand-sent Morse code aiwl amateur-radio network protocol, 
and attempts to 'understand' the Morse code conversaMcm. 

COMCO-1 consi^ of three components: a signal-processing system, a 
Morse-code-to-character-text transcriber, and a text understander, or parser. The 
signal-processing system produces a file of m»k and space durattons based on its 
analysis of radio signal. 

The transcriber, a software system caHed COh^DEC (Computerized Morse 
DECoder), converts marks and spaces to character text using a set ©f modules, each 
of which is an 'expert' on one aspect of transcription. Each module corrects certain 
types of errors and makes additions to a set erf suggested tr^iscriptipns, where each 
transcription consists of a list of vocabulary eleraents, COMPEG is akJed by 
dictionaries of ham-radio jargon and the English langui^e. 
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2.2 An ATN Parser for Morse Code Conversartions 

The pareer, <^led CATNIP (Cofnco-i ^jgmented Transtton Network 
Interfaced Parser) [16], uses an augmented transition network (ATN) grammar to 
evaluate the transcriptions suggested by COMDEC with respeict to their syntactte 
and semantic coherence and selects one «iat matches a path ttirough the ATN. The 
grammar includes a transition network thai represente the ^ntactic/semantlc 
structure of a Morse code conversation, and a set of registers, and functtons ttiat 
operate on them, designed to store inf omiation extrsKrted from a conversation. Both 
COMDEC and CATNIP are written mo^ in KMDLCtS), al»igh-level programming 
language of the LISP family. 

The conversations largely consist of a ^orttmnd iangudge called chatt&r. 
Network protocol and the Hnrtted vocabulary of clatter eonsbittn ttie possibi© to^cs 
of conversation to the statemerrt and qi^ry of oper^or identiflcatton, signal 
characteristics, rendezvous information, message. *afffc Information, and so fortti. 
The conversations are task oriented, and a pwser cai 'comprehend' the diatogue 
because both the topic of conversation and the movemert from topic to topk: te 
severely limited.* No formal definition or tens^age generator exists for this 
natural-language-W^e jargon, so the grammar was derived from several hours of 

franscripte. 

This grammar follovi® the ATN formalism d^cribed by Woods [24]; An 
augmented transitton network consists of two con^xMients: a transition n^work 
(TN), and a set of re gisters with associated functions. Airamition network is aset of 

* An example of a short but typical conversatton mat can be parsed by CATNIP is given in Appendix 
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'named' finite state machines, or subnetworks, viAwm a tramitton symt)ol may be ttie 
name of another (or the same) subnetwork. When the name of some subnetwork 
appears as one of the ^mbote of a ^ansition* it indicates a 'push' to that 
subnetwork, in the sense of calling a subroutine, A torwinsA^ate indicates a -pop' to 
the 'calling' transition, which may then be foHowed to#!est^ itiles^nates. When 
other words appear as transition symbc^, the p^s©* operates the subnetwork as a 
finite state machine, grttempting to 'accept the inpytssqu^ice. 

An ATN also includes a set of regrfs/eis deigned to hold contextual 
information, a set oltests that determine the validity of a word in a given context, and 
a set of actions to change the contents of the registers as the context shifts. A 
possibly empty set of tests and actions Is associated with each transition. When a 
transition symbol has been matched by one of the mechanisms described above, the 
transition may be followed only if eacfi of the tests can be passed. 

After the parser has been determined that a transition may be followed, each 
of the associated actions is applied Ijefore the parser continues processing from the 
next state. Actions are often used to build and connect parts of parse trees, which 
are saved in the registers until completed at the end of the parse, but this ability is 
not used by CATNIP. Augmented with registers, teste, and actions, a transition 
network has the power of a Turing machine. A more detailed discussion of 
augmented transition networks Is given by Ritchie [19]. 

CATNIP'S grammar conforms very closely to Woods' definition of an ATN, with 
two exceptions. The first is that CATNIP'S registers, and the tests and actions that 
act on them, were designed to manipulate the particular informational items that are 
expected to appear in chatter conversations, rather than to build parse trees for 
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legal sentences. These items inciude call-signs (names) and locations of operators; 
time and date; ratings of streng«?, tiiarity, ete.nei fi^jfwte; traffic infon»atlon like 
message number, length of message, and the me6sa^ body; and conversation 
history like pending questkM« and reque^s. 

This exception rlk^rsrtes one of the mo^ powerful fee^res of the augmented 
transition network model: the po^iblHty eidsis <rf adding to the model whatever 
facility Is needed and seems natural to <te the j<rt). An addition requires only a 
relaxation of the restrictkHis on the types of tests and actions but no reformuWIon of 
the base model. 

2.3 The Syntax of Chatter 

The second exception to the standard ATN is the unusual organization of 
CATNIP'S grammar into topical categories. Each of the nineteen subnetworks is 
designed to process a particular set of semantically related substrings. ATN 
knowledge bases for language processing are usually organized into subnetworks 
that process syntactic structures, such as 'noun phrase' and 'verb phrase' in 
English. A subnetwork begins processing a substring when it is referenced by a 
'push specification' (i.e. the name of the subnetwork) on a transition of a higher-level 
subnetwork. The push specification perfonns the dual role of expressing a top-down 
prediction that some particular kind of item is needed at that point in the Input 
Stream, and indicating which subnetwork is to be used to process the item. The 
suitability of a particular type of category (for example, 'noun phrase' Is a syntactte 
category) depends both on the ways that grammatical predictions can be phrased 
and on the classes of items that can be processed in a similar fashion (i.e. by the 
same subnetwork). 
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It has been suggested by Ritchie [19] that this 'subroutine' mechanism 
presupposes a syntactic organization of the gracwnar into subnetworks and that a 
semantic organization cooJd not be viable, "sine© semantic categories are not the 
appropriate organizational units for an augmented b-ansition network grammar." 
However, I have found that the addition of m^uiIng- based categories Is not only 
justified, but aiso superior to using only syntactic categories for embedded structure 
processing In the Morse code radio network domain. 

The chatter language Is sulftelen% limited, little syjitax exists, and what does 
exist is either weak or can be described In more revealing terms as a result of 
semantte considerations. The language consist of only four generic types of words: 
q-signs, pro-signs, call-signs, and abtimviations[Zl Q-signs are Internationally 
agreed-on abbreviations which were devised for raidiot^^raph use. Each q-sign 
represents a complete thought; e.g. "QSK" means "I can hear you between my 
signals; break in on my transmission" and "QTQ ?" means "Can you communicate 
with my station by means of the International Gk)de of Signals?" The first letter in 
every q-sIgn Is 'Q'. Pro-signs, or proc^iure signals, also jiave precise definitions but 
do fiot express complete thoughts and are ck«e|y rete^ to network protocol; for 
example, "AS" means "wait" or "stand by", and."AR" m^f^ "end of transmission". 
Call-signs are station Identifiers and serve as names of radio operators. The final 
category consists largely of simple abbreviations of commonly used English words 
and phrases; for example: "GA"' means "go ahead", ''NR" means "number", "OK" 
means "okay" and "PSE" means "please". The frequency of these English 
abbreviations Is so low that oi English- like syntax model coukJ not be developed tor 
chatter. 



18- 



There are two types of syntac^ rules. The first is characterized by the 
following example: If either of ttw constructs ''cdBsign DE callsign" ("Station 
<call-sign1>, this is station <caH-sidfi2>") 6r "DE csrtteign" ("This Is station 
<can-sign>''f occurs in a transmlsskjn, It occurs near the beginning of that 
transmission. A 'transmission' is equivalent ttj^ a •Sentence* In spoken conversation, 
and it does not necessarily Include everyftilngtmnsmttted by a Single operator 
between signals from other operators. 

The second type of syntactic rule Is the or^fer of the *ai'pimenfe' ttrat follow 
almost all q-slgns and many other words, e.&;«QSt K^G NR 3 ?" ("Gan you 
acknowledge receipt of message number threef?")«tfid "mZf^OCK ^00" ("You are 
being called by Rock on frequency 3.500 KHr"). The definition of each q-sign 
inclu(tes a set of informational '^ots' that should befWetf by the q-sign's arguments 
(for example, "QRZ" alone meam "You are b^ng called by -^ - - on frequency 

kHz"). However, "QRZ 3500 RCXDK" Is just asuieaninghjl as "QRZ ROGK 

3500", and the phrase may be transmuted b^h tways, so order Isn't really very 
Important here. It is clear from tfmse examples that thes^syntat^te rules can easHy 
be reformulated In terms of ^e uncterlying sema»i«C8. The only syntax rule that 
seems very strong is the fact that argumenfe aHwSys fi^jSSif the word of which they 
are argumerrts. 

The first two constructs discussed £*)Ove ar« different \A^iys of Identifying a 

new operator as she begins transmrts^on. EWier cari occtff In any posWon where 

self-identification of an operator Is d^iredr loglCfiWy thfe Is at ttie begaming of a 

^CATNIP'S grammar usrc the convwticm fhai mt viwd Ai lo««r-awe l^e« is a generic Mmn, 
which is replaced by an appropriate chatter word at parse-time. 
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transmission by that operator. The syntat^ rule is replaced by a more intuitive 
semantic rule that groups the two phrases r« Hw topical category "Identification of 
Operators", denoted ID-OP in the grammar. 

The number, type, and ordering of the argument words not only depend on the 
lexical features of the particular word of which they are arguments but also are a 
function of the context. For example, in the phrase "NR 1 GR 200 QTR 1500" 
("[message] number 1, with 200 groups, at 1500 hours"), "GR" Is followed by the 
number of English words or code-groups in the next message. However, In a 
transmission like "PSE RPT GR 10 , 20 , 30 OK ? K" ("Please repeat code-groups 
10, 20, and 30. Okay? Over"), the arguments of "GR" are one or more numbers 
separated by delimiters, referring to the previously sent code-groups in positions 
<number1>, <number2>, ,.., <numberN>. Thus the syntax of a word's arguments 
depends on the current topic of discussion. 

The potential of syntactic rules Is further weakened by the spoken -language 
aspects of chatter conversations, for example, the existence of noise words. These 
include chatter words from both the pro-sign and abbreviation categories - such as 
"R" ("roger"), a pro-sign, and "NW" ("now"), an abbreviation - that an operator 
often sends as 'filler' while she is deciding what to say next. So another syntactic 
rule might be that a noise word can appear anywhere In a transmission, except as 
the last word in that transmission. However, most potential noise words can also 
appear as meaningful words in various contexts, for example "R" might be the 
response to "QRO ?" ("Shall I Increase transmitter power?"). 

Noise words can appear at any time, because they are meaningless; this Is a 
semantic rather than syntactic consideration, so this rule may be reformulated as a 
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semantic rule that allows meaningless words to appear in ^y context and requires 
them to be disregarded by the infomiatton-aG«imute*ing laeohani^iis of tt»e parser. 

2.4 The Semantic Structure of Chatter Conversations 

Although the syntax of chatter is weak, there is a strong semantic structure 
Imposed on Morse code conversations by radio network protocol- First, the 
operators involved must establish contact with each other, and this is represented by 
the CONTACT subnetwork in the ATN. Next, one operator prepares to send some 
message, and then sends it, either as code-groups or English text; this Is 
represented by the TRAFFIC subnetwork. 

Immediately following the sending of traffic, the receiver may ask to have 
several words repeated and eventually acknowledges receipt of the message. This 
process is modeled by the REQ-INFO subnetwork. The TRAFFIC and REQ-INFO 
subnetworks are repeated until all operators have sent all their prepared messages. 
Then the operators t>egin signing off, which usually Involves negotiations regarding 
re-establishment of contact at some future time: this is represented by the 
END-CNCT subnetwork. At this point, the conversation may terminate, or one of the 
operators may continue by trying to establish contact with a new operator. 

With one major exception, these four toptes are the only possibilities for 
discussion and they always occur in this rigkj order. The exception is the 'Interrupt 
Subnetwork', denoted INTRUPT in CATNIP'S grammar, which can be pushed to 
(called) from any state and represents an Interruption In the smooth flow of 
transmission. The possible types of Interruptions include a third operator suddenly 
breaking in on a conversation; sudden static on the air waves, which must be dealt 
with by changes in transmitter characteristics or frequency; and so on. These 
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interruptions are very difficult to parse since the context Is made rrwaljd by the tweak, 
and this presents an interesting problem for the parser designer. However, to make 
the problem addressed in this report more tractable, I have ignored the 'interruption 
problem'. 

The four main areas of discourse are broken dow/n into additional subnetworics 
based on topical categories. For example, CONTACT has transitions Indicating 
pushing to (calling) the lower-level subnetworks ID-OP (identification of operators), 
NET- RELAY (relay of operator identification through the network controller), and 
QUAL-CNCT (discussion of signal characteristics). It Is only within these 
lowest-level subnetworks that syntactic structure shows up, for example, in the 
ordering of q-sign arguments, but, as discussed above, this structure results from 
semantic as well as syntactic cdnslderatlons. 

The semantic category of a push (call) specification fulfills its role as a 
top-down prediction that a particular topic will be discussed at that point In the 
conversation, and of course it indicates which subnetwork is to tje used to proems 
phrases discussing that topic. Semantic categories are more suitable for this 
application than syntactic categories due to the limited syntax of chatter and the 
strong protocol constraints on the discourse Structure of a conversation. 

The semantic organization of this ATN grammar not only is very unusual but 
also plays a unique role in the partial and lirhited solution to the grammatkjal 
inference problem discussed in the next chapter. 
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3. Grammaticaf Inference of ATNs 

3.1 The Grammatical-Inference Problem 

Scientists have been using formal linguistics for modeling natural and 
programming languages for over twenty years [14]. Qrammars have been employed 
to describe the syntax of languages like chatter arxi can be used to characterize a 
syntactic source that generates all the sentences in a language. It would be useful if 
the grammar could be directly inferred from a set of sample sentences in the 
language in question. The process of deriving a grammar from a set of examples Is 
ceiled grammatical inference. 

The general grammatical-inference problem Is simply stated. Assume the 
existence of a source that generates strings of the form x = aiag-an, where x is a 
sentence in a lar^uage L and each aj is a word in the lexicon of L. L is assumed to 
possess some unique structural features that can be modeled by a grammar G. The 
grammatical-inference machine is given a finite set S* of sentences that are in L, 
and possibly another finite set S' of sentences that are not in L. Using this 
information, the machine must Infer the syntactic rules of the unknown grammar G. 

The first difficulty encountered is the necessity of obtaining extra information 
in order to find an appropriate set S". Although the set S* can be obtained from the 
source, the set S' can be defined only if an external teacher, who knows something 
about the properties of G, is ayailab)^. Unforturiately, without S', the 
grammatical-inference problem Is unsolvable except for a small number of highly 
constrained grammars [8]. The chatter language has this problem, because, with no 
formal definition, there is also no algorithmic means for determining that a given 
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String of chatter words Is not Hkely to be transmitted over Morse code networks, or 
even for deciding whether a given word (that is not a ij-signy Is In the chatter 
vocabulary. 

Even though It Is Impo^ible for a grammatical- Inference machine to find 
exactly one grammar for most languages without this negative Information, It is often 
possible to enumerate a large set of possible jammers and then narrow down the 
solution In some way to a single grammar. A grammar is 'possible' in this sense If it 
accepts the sample. The problem of narrowing down the state-space to one 
grammar has been solved for regular languages, the very simple languages that can 
be generated by regular expressions and accef^ed by imite state machines (FSMs). 

The limited case of regular languages is solvable because tvw3 finite state 
machine grammars that generate the same language are equivalent. Since all of the 
accurately enumerated grammars are equivalent, only one need be constructed, and 
It Is the correct solution. FeWman et al. discuss the concert Involved {12], and two 
algorithms are presented by Biermann and Feldman [3]. 

However, these algorithms cmnot be utilized to extend tf»e grammar for 
chatter, since the nesting features of natewal language are not adequately 
represented by finite state machine grammars. Chatter can be a»TSidered a natural 
language, because Its representation requires nested structures, whteh are 
represented by the subne^orks of the ATN knowledge krase,^ ar«* because It Is an 
evolving, 'spoken' language."* 

'^Section 2.4 

^he similarities between chatter and nartural langoages 1^ Eojtffeh are discussed further In 
Section 4.3. 
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The context-free languages are more power^l t>an regular l£tfiguages, 
because they can model the self-embedding and nesting pi^operties of natursd and 
programming languages. They can be represented by grammars whose pr^uction 
rules are of the form A -> a, where A te a single nonrtemwial symbol and a is a string 
of terminal symbols and nonterminal syn^jols [t]. A fewi/na/ $ym6o/ is an element 
from the language being modeled. Since the left-hand side of the rule contains a 
single symbol, no conte)rt is necessary lo detemwje the^deriv«rtion of a sentence. 
Context-free languages sre accepted by transition fl^worte (TN^. 

It Is considerably more difficult to derive grammars for context-free languages 
than for regular langue^es, becau^ an in^ttenun^3t^<^|»p^li3>le gr^unmarscan be 
enumerated for eaiy set of data. No algoWwi exi^ that can dedde w^^her two 
arbitrary context-free grammars accept *e sae^ Ian|^;^l0e* Some mecfwaii^ 
needed that limits the number di grammars produced to? a frac^We tevel and then 
selects one of them tf«rt is 'best'. Sucf» a mechaniwi Is termed em 'evaluation 
measure'. 

One approach to solving this problem is^ look lor a reasonably go([^ fa, with 
some suitable definition of 'reasonsrt^, rather llia» trying to find a grammar Ihat 
generates exactly the input sanq^ie. Cook^ate&[7j«i^ an i^inite UMiguage, i.e. any 
language tha* includes an infinite nun^jer of sen^noes, assures a discrepancy 
between a grmnmar inferred from a f«iite sample mrdlSm^&mm fo^ the teuig^aiQe. 
He used a cost function measuring the tradeoff betweehclecr'ease in <x»nplexi*y and 
increase in discrepancy to bound his machine's search-space. The macl^ie 
descrHDed by Wharton {22] uses a simiter«»ali»tion measure, ^t it rec^ves its 
examples via a multi-step method rather than all at once; this methodology tends to 
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increase the efficiency of enumeration but cannot guarantee minimum complexity In 
the ultimate result. 

Another approach is to require a huma»i 'teacher' to guide the grammatical 
inference machine as it enumerates possible grammars and select the 'best' 
grammar according to some sul^ectlve meeeure. In the sc^ario developed by 
Knobe and Knobe[17], the- teacher is a knowledgeable person who provides 
individual examples in optimal order w^lth optimal variety, and who can recognize 
grammatical and ungrammatical strings without knowing the formal grammar for the 
language. The machine enumerates first general and then more specific 
productions, and each production is tested by the t^«;her as it is enumerated. The 
machine retains the most general rule that does not produce any strings ruled illegal 
by the teacher. This scenarto places a hea^ burden on tiie teacher to present an 
adequate 'course'. 

A third approach, described by Grespi-Reghizzi[^, attaches structural 
descriptions to the exgonples. This limits the number of hypotheses that are 
compatible with the data and thus reduces the er»umeratlon problem. The extra 
information, although similar to the type of information required by the 
complexity/discrepancy measure and the teacher's judgements above, must be 
justified, since it departs from the standard model of grammar acquisition. 
Crespi-Reghizzi explains that this structural information is similar to the stress and 
intonational information available to a child acquiring a natural language, and that 
the widespread belief that there must be a partially semantic basis for the acquisition 
of syntax implies the availability of some structural information to the learner of a 
language. Of course, the availability of structure vastly reduces the number of 
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altemative possible grammars and assures thirt ttie acquired grammar gerterates 
sentences with structures consistent with their meaning. 

The grammatic€rt-lnference macMneo described above are aH succes^ul for 
subsets of the contexMree languages. However* mere teas yet no algorithm ttiat ean 
infer the complete set of rewriting rules from a f»s^e smnpie of an arbitr^v 
context-free language [6]. tt Is not surprising mirt no general mechanism has been 
developed for grammatical inference of supera^s of the context-free languages, 
particularly the context-sensitive lar^uages, ¥»bto*i4ncta<te aH rraitorsri langus^es. 

Now, the context-eensltrve are even more powerful than the context-free 
languages. They can be repr^ented by grammffl^ vwlh^ocfaMrtion rules of the form 
a --> b, where both a and b consist of any number of terminal and nenteiminal 
symbols; the length <^ a rtiust be lees than or eivM»*ollietef^tti of bi^]. SNice the 
left-hand side of a rule may include more than one symbol, context '» necessaiy to 
determine the derivation of a sentence* The con^xt-saisitive languages are 
accepted by augmented transition n^works (ATf^te). ^ natural language are 
members of the set of context-sensitive language® cof^xtual InfomaAion Is 
necessary to parse constructs such as fi^lexfves aid vete^ie c^^uses mi Engfeh« 

3.2 Grammatical Inference and MAGE 

This report describes a grammatical-extension machine for an augmented 
transition network grammar for a very limited 'natural' language. Since augmented 
transition networks represent and are equivalent to the context-sensitive grammar, 
the development of MAGE is a small step toward a general solution to the very 
difficult problem of Inference for context-sensitive grammars. 

There are three ways in which this machine's model of thie 
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grammatlcal-inferenee problem diverges from the standard model diseussed In the 
first section of tWs chapter. The first is thart MAGE'S gmmmar is not inferred from 
scratch but builds on a core grstftimar, M^lch Include a smsJI transition network, a 
set of pre-coded functions for the tests and acfions, amd a dictionary of q-signs (but 
not other chatter words). 

The second difference is that the gramma' Is extended Incrementally; that Is, 
each example is successfuHy leeB-ned before the next example is provided. This 
makes the inference problem more diffteult than usual, becau^ MAGE cannot 
exploit structural simllarlttes between axamptes when determining the embedded 
structure of the grammar. Jfm inereroentel ieeimmM necessary In the K4orse code 
domain, because a structurally complete sample Is required in order to derive a 
complete grammar for any language [6]; a positive sample of a language is 
structurally complete if each rewriting rule of the grammar is used at least once In 
the generation of the sample. It is impossible to generate a structurally complete 
sample of chatter, because no formal grammar exists, and the language Is 
continuously evolving. In other words, since the grammar can never be complete, 
the extension mechanism must always be ready to add one more example to the 
grammar. 

The third difference is a result of the second: the extension procedure is not 
expected to result in an exact grammar for the language that is equivalent to some 
known fonnal definition. The best that the system can do, given the constraints of the 
domain, is to generate an extended grammar that understands all sentences It 
received as examples, plus a large numb®" of simlter ^niencM. 

Keeping in mind these deviations from the standard grammatical- inference 
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model, the computer program the author has (teveloped is successful at what A tries 
to do: extend an augmented transitieiT network gtammar for the limited Morse^ode 
domain. MAQE is ar\ enumerative procecfc^ e in the sense ttiat it consid^^ many 
potential additions to the ^r&mnm for eadi exanple tt is supplied. However, the 
evaluation measure guides the enumeration of possible exterffiions, and ?ea^ 
enumerated extension is selected or reJectedbelCNB the next extension is 
postulated. As soon as one extension has been approMSd, tfie enumerartion process 
halts. Thus all but the ultimate result »er^ec^d before wr/ data structure is 
generated. Since only <me 'j^tysical' gr»nn^ ^iste at any point in time, and 
extensions result In physical attera^@ns of^us^ita strtQctur^tim program may be 
consictered a consffucf/i^ nwdel. 

3.3 Hypothesis Formation and Selection 

There are two phases to the hypothesis-formation/hypothesis-evaluation 
process. The first is the selection of a structural extension to the transition network, 
to result in a grammar that can accept the current example. The second Is tiie 
specification of a set of tests and actions to be attached to ttie transition network, to 
enable the parser to understand the current example. These processes are 
independent and sequential, and they are presented here s^arateiy. 

MAGE operates on a transition network grammar (it ignores the tests and 

actions during this phase) consisting of thirteen topk:atly categorized subnetworks. 

Given an example tr ansmission, or an example conversation containing one or more 

^he 'Interrupt Subnetwork' and the five related subnetworks of CATNIP'S grammar are not part (rf 
MAGE'S core grammar, becatee ttie euttem versicm of MAQ£ doe* uptxtearf nwWi ttie Nv^^^alion 
prot)iem. 
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speaker changes,^ the program first deterniH^s if tNB ex^nple is already accepted 
by the grammar, by attempting to parse it MAGE tries to match the example to the 
grammar using a standard Iransitton netowork pambig arfgerithm, with owe deviation: 
rather than requiring a singte start-state, the parser performs a depth^first search 
from s^erai potential ^art states in(*jdifig^ aW sli^t that can precede tlie 
beginning of a transmission. An example shoukJ not loeQm m mid-transmission, 
ertthough the program can handle this in «ome stances. The parse is 
nondeterministic, i.e., conceptually it f^lows many p^is in parcJIel (although it 
actually uses a depth-first seare*i), because tte,gr«fiRiai? rmymrAam more than one 
subnetwork representing the same subsequengeefrteliepsof m>r(is, as do m»iy TN 
and ATN gneunmars. 

If the example is already accepted by the #ram«iiiri the program prints an 
approprigtfe menage and a^s for anottier example. # the^^ir^ word or words of the 
example are accepted by one or more subnetworks* feut ttie foitowteg word does not 
match any trsrisition teavmg the test s^ile of any of tliese partial paths, the 
hypothesfe-formation procedure takes conto^t^ ii«th powiters to the 
iast-matched-states' and the next word in the ej^a^rtt; The same sequence of 
words may be accepted t^ more than one ^te^wori^, because the f^rse has 
multiple start-states and the grammar is inheieii^ j^sndetermii^ic. If the first word 
of an example Is ntA accepted by any transitiQ»ileawfiH3 ijny stort-state, ttie set of 
'test-matched-states' in this caffi» consi^ erf the possibte start-^tes discussed 
abON^, ar«i the next vwyd in the example te^e itfs^ or^. 

^A "speaker charge' occurs ma Morse-code Gonvef^ation wN« one operator ceasM transmWing 
Morse code and another begins. 
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At some point in the example, marking the end o# the new phrase, the words of 
the example resume matching the symltols on ttie trai^Wons of the TN. This may 
happen at more than one sUs^, for the reasons ste^ above. If the new phrase Isal 
the end of the example, it matches any terminal steie to the transition network by 
default. The state(s) containing the tr»isMof^sV where the path regimes md the 
terminal state{s) matched by default arfe called ttie^end-of-plwase' states. The task 
now is to add some structoral representation of the woiCte between these matches 
(the new phrase> to ttie transi«on network con^)0iwi*<rfH« grammar. MAGE uses 
the models presented below t&iKComf^sh flife objecflve. 

The set of models represents alt singte*tra^Won ®(tef«sions to the general 
three-state finite state machine shown in Figure 1, with several exceptions: it Is 
undesirable to return to a ^art-state from some ottierstfite in ttieaitm^work except 
in a small number of prescribed circuffmtanoe^ it isiar^rat^ for a subnetwork to 
contain a tennlrert state, sa^ then r&pem thfr entire 8u*)netvw)ri<, rather ttian return 
from that state to the start-state. A single subnrtvi/ork vn^UHtf tests and actions is an 
FSM. Model (Figure 1) represente the origtoal ^attm of a subnet¥W>ri<: the circle 
containing S is a start-state; the circle with the d£M4c^)ed area Is a terminal state; ttie 
single intemiediate state represent the mbWarii^ com^x web of states and 
transitions between the start-state aid a terminal state in an actual subnetwork. 

Each of the models Illustrate in Fifpres 2 thrmigh 8 n^weserts a general 
one-transition extensfon to model 0. All a^nsionsttiat areposstole, considering ttie 
chatter domain, are Included in ^is set. Any of the^ire© circles in« these flKsdelatti^ 
con-espond to the original circles in Model ii»y iBpresent a •test-matched-state' 
and any terminal state may represent the 'end-of-phrase', depending on the 
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particular model and circumstances; a circle other than the original three always 
represents a new circle added to the subnetwork as part of the extension. Since 
each example is expected to include only QQg new phrase, only one type of 
extension is actually used for each example. However, the new phrase generally 
consists of more than the single word that cati be attached to a single transition. The 
transition can be viewed as modeling a string of transition/next-state pairs, with the 
first transition in this string leavinga state in^ pilgyif^E^h%(^f^^)^^^ ^ shown In the 
model, and the final transition connected to the next state shown in the particular 
model. 

© <D <!> 

[MODEL 3 



Figure t:^40detO 
General sutxietwork 
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Z MODEL n 

Figure 2: Mod^ 1 
'Last-matched -state' becomes terminal state 
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Figure 3: Model 2 

A terminal state that Is also a 'teet^iiiatehed-atete' 

becomes a p(^ble intermediate slate 



& 
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Figure 4a: Model 3 

State(s) inserted parallel to transition between adjacent states, 

which are Mast-matched-state* and 'end of -phrase' 
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Figure 4b: A ^^ecMat cii|e of Model 3 
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Figure 5: JModfl^ 

Transition loops to ^me sfate, which is both 

'last-matched-state'ipd 'end-of-phrase' 
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Figure 6a: Model 5 

Transition returns from 'l£3t-matched-^ate' 

to previously visited ^ate 
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Figure 6b: A special case of Model 5 
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Figure 7a: Model 6 
Completely new path is formed in subnetwork 
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Figure 7b: A special case of Model 6 
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Figure 8: Model 7 

New transition added between 'last- matched -state' 

and new terminal state 

MAGE compares each model to each last-matched-state/end-of-phrase pair. 
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The hypothesls-formatiorr procedure enumerates a set of model/pair combinations 
called 'templates', matching particular states In the model to the lastmatched-state 
and end-of -example of the pair. The firet component of the evaluation measure 
guides this process, restricting it to enumerating only |hose models that provide a 
means for accepting the new phrase in the finite state msKitik^ sense: the first word 
in the phrase match^ some symbol attached to atransition leaving the start-state of 
the extension derived from the model, the second wonj matches some transition 
leaving the state pointed to by the transition for the first word, and so on. The state 
pointed to by the transition rnatcNng the last word in the new phrase must either be 
a terminal state or contain a transition that matches the first word in the rest of the 
example, which follows the new phrase. 

If there Is only a single last-matched-state, and only one of the above models 
provides a mechanism for accepting the new phrase, then this model is subjected 
Immediately to the second component of the evaluation procedure. If this model 
also provides a mechanism for acc^iqg the new phrase In the context of the 
cun^ent example, i.e., the entire example wouW be accepted by thfe core grammar 
plus this extension, then the evaluatTon is said to 'succeed', and the extension is 
physically added to the data structure representing the transition network 
component of the grammar. In this case, the test/action phase of the 
hypothesis-formation mechanism begins operation. If the evaluation fails, the 
example is rejected as unlearnable/ 

If the structure of the example matches one of the atx)ve models, but there are 

^The author has not teurKi any actu^ tranBrnteslonsmi* Contain phiaaes ttet<ause MAGE to faH, 
with the exception of transmissions contertnlBg me of tfw^r^wtiirtions dtecus^d in Section 2.4. 
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several last-matched-states, then the evaluation measure selects the first of these 
states that passes its criteria. This selection is justified, because in nearly all 
instances the first passing state is the only one: conflicts are prevented by a strict 
ordering of the start-states via the subnetwork in which each appears. 

There are several situations in which more than one model is represented in 
the templates produced by the hypothesis-formation process, and in these cases the 
evaluation measure must select a model as well as a particular state pair. Consider 
the example "QSA 5 NW QTC K" ("The strength of your signals is excellent now. I 
have messages for you") diagrammed below, and assume that "QSA 5 QTC K" 
("The strength of your signals is excellent. I have messages for you") is already 
accepted by the grammar. It is not clear during the hypothesis-formation stage 
whether to apply model 2 (Figure 9a) or model 3 (Figure 9b). Therefore, both of 
these possibilities are passed to the evaluation measure, which chooses between 
them on the basis of which transmission 'makes sense'. 

CEXAMPLE] 

Figure 9a: Model 2 applied to example 
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Figure 9b: Model 3 applied to example 

In this case, model 2 wins, because the evaluation measure decides that "NW" 
refers to "QSA" rather than to "QTC". "QSA 5 NW" ("The strength of your signals 
is excellent now") is a plausible update to an earlier transmission like "QSA 1 " ("The 
strength of your signals is very poor").® However, *•[#/ QTC" ("Now I have 
messages for you") would not make sense unless the operator had previoimly 
transmitted something like "Wait. I will have messages for you soon": this statement 
cannot be mad^ with the phrases contained In the core grammar. Of course, the 
extended grammar still accepts "QSA 5 QTC K" because the terminal state following 
the generic token " # " is not deleted. In fact, nothing is ever deleted from the core 
grammar; the only alterations performed by MAGE are additkHis. 

After a specific structural hypothesis has been ^tected by the evaluation 
measure, the machine enters Its second hypothesis-formation phase and 



\ "C^A 5" is accepted by the core grammar, "QSA 1" is also, since the generic token '#' 
matches any nurT4)er. 
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enumerates a set of potential test and action specif tcatte»Wtfor^i6h Jransi^n of IN 
new extension. If the symbol on anew trmsl^nis aqt8i0i,ttK>aesiQtton8 associated 
wWi q-signs are enumer^ed; iK»ie c^tfieprepafieiile^s^ should be associated with 
transitions whose synrtKris are q-signs. The q-aign actl«is put informirtion conveyed 

by q-signs and thelrargumaitsin cer^wi regte*€»8j 

<quality-of-cqntact> 
<pendihg-questions> 
<expected-actions> 
<general-situation-description>. 

If the transition symbol is some other type of word, but not a 'noise' word, the 

entire set of non-q-sign actions is enumerated. These actions put information In 

other registers, including: 

<information-ab6ui-r0(iefs/in^-6p6rator> 

<infofm0tionral3i<mt:^mdmg-pper9tqr> 

<id-number-of-mes$age> 

<fnjint)9r'<)f^MQfdS'ifirm^9^§»^ 
<number-of-words-received-so-far-in-me$sage> 

and others described in Appendix III. 
If the symbotis a generic toNenvi^-*'caH8ign",'!aBy",'»#", "cteKm-', qr "kicatlon", 
the entire set of %es^ is piesed to ttie GKiirttJatoHi ifioaaufm Thest tei^ serve as 
flHers to ensure that the chatter word tt«it AH^hes ai generte token is reaso^rt^^n 
Conle>d, to prevent every random word frora^nic^hfeig '*8ny''s lor exam^ste, sine© this 
symbol is intended to malch only codergroupstK BigKsbjwfOFds In a messa^body. 
The tests and actions to be associated with the new trarwltion(s) are now selected by 
the«valuatloh measure. 
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3.4 A Unique Evatudtion Measure 

The evafuatlon component of MA6E is r«*h«r imusuai in that it does not 
incorporate a cost function or other complexlty-r^arted ^©nsideration to select ^e 
•best' from among the set of possible strycturai ^(tensions or test/action 
specifications, nor does it use some subje^lve nw^sureL produced by a human 
teacher. Instead, the evaluation measure is bs^ed on the soniahtlc organization of 

Q 

the augmented transition network grammar. 

The criterion for selecting a structural extension is simply: "Will this structural 
extension place the new phrase in the correct topical context?". Similarly, the 
criterion for augmenting a transition with a particular test or set of tests Is: "Will this 
test or set of tests ensure that all words accepted by this traosltion are meaningful In 
the current context?". An action or set of asitioHstedpfjfc^lbra transition If those 
actions will select and save the important informiffiort contained in the phr^e and 
ignore any meaningless words. 

Tho first cr^rlon is fairly simple to Implement for phrases eontelning at least 
one q-sign, because all q-signs are assoclMed a priori w^ appropriate topics 
represented by subnetwor1<s. There are usi^ly two orthtseesubnetworHs In which a 
particular q-^n might make sense, but ttiftcor^xt of the rest of the example 
provfldes enough informertion to unk^uely c^tecmtoie *« topfeal category of the 

phrase. 

Those phrases that contain neither q-signs, nor other wonte that are known to 
be synonymous with a particular q-sign (e.g. "RPT" ("repeat") is synonymous with 



^Section 2.4 
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"Q&M" ("Pleeee repeat - - -")), are more difficult to evaluate. When an example 
contains an unknown word, MAGE asks the user if itjs a%««)nym of any known word 
and, if so, which one.^° Either the new word has a known synonym, or one or more of 
the other words in the phrase have known raei^ir^ tf^ ca^ be used to determine 
the meaning and topic of unknown words; ttiislopifs^rieli^tion is used to place the 
new plwase in the appropriate ajt>n^work 0,e* i^pntext). 

The selec«on of tests and actior^ proceeds ^teng sif«ijlar lines. Most tests are 
attached only to traMisitions with a generic symbol; most actions are attached to 
transittons with the symbol "new-speaker" fd€»K>ttr^ a ip<^gs«f cfpige), a 0^"^!?? 
symbol, or a q-sign. In addition, the evalualion^^flae^iJfe may attaph actions to most 
symbols in the REQ-RPT subnetwork (request for somfthtPft;*P *>6 ''^®^®^ ^^ 
response to request) even though they wa^ nf< generated during the 
hypoth^ts-formation phase; it is desiral^ to stor^ any repeat r^uest until it h^ 
been answered, regardless <^ how the nequ©^ was phri^eci, Thisis one of the marry 
Semantic oonskterattof^^alt with by iBiee^l**#i<p B?#^^ 

The likely number and lyp© of afgusiente for each q-sign are part of the 
machine's domain model, and they can be loc^^ed MpM%|aMa* This knpwl^ge is 
used to attach actkM^ to the fra«isitk«ss of qsisri^ifipiinfn^fhatconveyinformati^^ 
thatshouW be stored «i some regi^, fns©w#ca»i^,lK#ffi^^, aq^sigm may appear 
with a totally unexpected set of argumer^, and #*e appropriate actions must be 
fciferred from knowlei^e about the argui^nts §iwns#lves. The generiq tokens 
"call-sign" and "any" appear in^onfy a ^oatt fi*<wbw <^ context (tt^ ip-QP and 



^^User-machine interaction is discussed furthefmSeeti^B 5^2.1. 



42- 



NET- RELAY subnetworks and ^ q-sign arguments, and the MESS AG sn6 
REQ-RPT subnetworks, respectively), so their tests and aettons can be effe^ively 

pre-programmed. 

The major difficulty is with thegeoerte token "#", which can appea- In almost 
any context and almost always has some lnni>ortant mearang. Fortunately, " # " Is 
often preceded by some other word with asaJCfetedtest/aetioninform^iOR that can 
be transfen-ed to its argument. But in many csBes fliere Is no w&f/ of obtaining this 
information except to compare the particulo- i»e of "#" with its appeo-ano© 
etsewhere in the grammar, and to borrow the a<^3n8 associated with ttie closest fit. 
this method is actually very successhW at setecttng th»s«rae set of actions that I 
would have selected by hand. 

After this component of the evaluatton measure has approved a set of 
test/action specifjcations for each tran^lon in the previously selected structural 
extension, the specifications are attac»»ed te^^ »»© ©xtensiOR and fiie data sfructure 
representing the ATN gnmwna- is le»entianert*ly* altered^^^^ 
extension. The addition is permanent fth tti© senw vlhstf it can nc»v aid in a future 
tjootstrap process as (Jtescribed above. 

The iise of semantic informattcm by M^«i^» evalosttion measune is similar to 
Crespi-Reghizzi's use of structural infownatlonf^ lor the inference of con^a-free 
grammars.^^ The major dWerence is ^la* }Cee8pWfe#ii2^ »«^o^s a compW© 
structural description with eaii of *«s e«awf}les.*4Aa£ requires an©logaus 
information; however, alt sermmlfe/s^actic aiwcteini l« pre-progriwimed if*3 the 



llr.: 



Discussed in the second section x^ iMs^ dia^ptar. 
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domain model of MAGE, and the program itself selects the structural information, 
which includes meaning and topic in this context, that should be associated with 
each example. 

The use of semantics to construct and evaluate extensions to a grammar is 
also related to some proposed linguistic models of human language acquisition. The 
viewpoint that considers MAGE an implementation of these models is discussed In 
the next chapter. A sample learning session with MAGE is given in Appendix 11. 



44- 



4. Acquisition of Language and Grammatical 
Extension 

4.1 A Model of Language Acquisition 

There has always been considerable debate among linguists about the 
process by which children acquire their native Janguage. Most models represent 
language learning as an active process of h^fjothesis-formatfion and 
hypothesis-testing: the child continually formulates hypotheses about the language 
she hears and tests them by attempting to use them to understand speech and to 
construct her own sentences. The child is not initially presented with the entire 
language but with a small subset of the vocabulary and syntax which gradually 
expands as her competence increases [22]. 

According to a model discussed by Dale [9], a hypothesis is confirmed if it 
accounts for the data already available and successfully predicts future sentences; 
otherwise It fails. However, a verdict of success or failure is according to the child's 
perceptions of language, not an adult's. A grammar that generates the sentence 
"Shoes on" would be unacceptable to an adult, yet it is considered successful by the 
two-year-old child who hears "Put your shoes on" as "Shoes on". Hypotheses thus 
confirmed become part of the evolving grammar used by the child. This grammar Is 
descriptively adequate, which means it makes 'accurate' predictions about the 
correctness or deviance of sentences that the child has never heard before, as well 
as being observationally adequate, which means It accounts for all the sentences 
that have already been heard. 

According to a similar model developed by Chomsky [4, 5], not only does the 
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observationaify adequate grammar account fo^^ieobsewvect sentences in the sen$e 
of recognizing their stfucturar organisation, buial8»»iTM«eesitipos8it)ie lor the child 
to understand the meaning of these sentenee8*%ik©wi8©t*h©desGriptivehr adequate 
grammar is capable of undenstandlhg Infinltely^^ many 8©nto*G@s ttiat the child has 

never heard. 

This modelmal^es the controversial propdsWOn that #t© child may /enow about 
certain aspects of language: ^ome knowfedge te^wiate^ awdltee^iUdia^d not ^m 
these aspects in the usual sense. T^ese irm^e asp^^ # tangua^^^ (^ed tttd 
universal grammar and, according to the mod^, form ttie basis for tiypothesls 
formation arKl evaluJ^Orr. 

Ch6msky*8 model Is founded in the rar/dmrtffif-a^i^l of tin^M^e thduf^t, 
which states that the structure of latf^uag# 16 t® a iDO^sicter^rie d^ree specked 
btologtcaWy, and the function of eKperienc# i» to iie#««©^ris1nrt£rte*oipa©^ and 
turn it into llfigt^tlc competence [431 Th©jWtkma9l8*«?latei8 that a great deal of 
psychological structure \i Wnate and ttTM^#**Wietn ehlkllias a specific, and strong, 
capacity for language. These ideas are ^iliported b^ ttie spectes-^peclfic ««d 
species-uniform attributes Of language, !<©. a« fuiiwstts <nd oiriy humans use 
language, and by the surjM-islngly »nall degree of dWietiJty a «*iW has with the 
general mechanisms of lan^ac^: the noflow Of a s«il«iBepttie^^tab«Shment ©f 
word classes and rules fbr combifW^th^m, wid solot#>» » 

Ttie rationafist theory |X)i^feites ttie l^isn^ft^ ol a i»*wersii gr ffirimar, such 
that a successful model Of a univ^sal^a«*mar vi^uld inc*j«Je ewtctly #»c»e features 
of language that children do not have to ^eam arte* «^irfd exclude all the unique 
features of their particular languages that t^iilcfiiei^ft^st^«cquire from the speech 
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they hear. It is a system of principles that categorizes the class of possible grammars 
by specifying how partiQ*teH' grammars £ffe erfl«f*Eed, how the different rules of 
these components are constructed, and how they interact. 

The theory proposes two types of untversaKf^^tu^: swbslan^ve and formal. 
The set of substantive rules includes claims that items of a particular kind in any 
language must be drawn from a fixed eli®s of items. The formal iingulstic universals 
include more abstract conditions involving ttie character of the rules that appear in 
granmars, corKKtions imposed on these rules, and ttie ways in which they are 
Interconnected. For emmpke, every human language utilizes the same basic 
grammatical categories (substantive) - sentences, noun phrases, verb phrases, etc. 
-■ and uses Ih© same graawriatical relations among these ea^prles (formal) " 
sul^t and predicate, verb and obiect, ^j. [1^, 

According to these models, the task of the child acquiring a language is to 
choose fromamor^ihoae grammarsaMowed by the principles of ttfuversal grammar 
that gr»nmar that is c(Mnp2rtlbte witt» the tentted and imperteGt data available to her. 
The chi W is taced with a finite set of utterances, mariy <rf them ungrammatical (due to 
slips of the toijgue, false sterts,miemofy Iap8^, etc.), that she has heard from her 
parents and other people in her environment From t»me utterances, she must 
deduce ^le underlying nrfOTm order to i*se her languag®- 

The concept of a restrictive, univei^alffn^ for grammatical development Is 
supported by the simtoritles observed by Q^M between tee early speech of 
chikJren in dffferent cultures learning xwdely div^'gent lai>guages. According to his 
observations, a chiW's ^iest grammar usually krcludes a two-word syntactic 
structure with tv»«) ctesses of words, p/Vor and open. The pivot class is small and 
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each word in it is used with many different words from the much larger open class. 
For example, an English-speaking child might say "bandage on", "blanket on", "fix 
on", etc. For this child on is a pivot word; it is always used in the second position and 
many other words can occur with it. Or the child might say "allgone shoe", "allgone 
lettuce", "allgone outside", and others; here "allgone" is a pivot that always occurs 
in the first position. A pivot word may be the first or the second element in two-word 
utterances, but each pivot word has its own fixed position. 

As the child grows older and has more experience with her language, she 
begins to use three-word sentences that are simply pivot-open sentences with an 
additional word. Agent-object and agent-action constructions merge into the more 
complex but more meaningful agent-action-object construct. Eventually the child 
develops the concepts of noun phrase, verb phrase, and all the other complex 
syntactic structures of the English (or other natural) language. 

Although MAGE borrows from these theories of language acquisition and 
universal grammar, this report is not related to the controversy surrounding these 
models and rationalist theory in general. The author does not intend the analogy 
between MAGE and these models (presented below) as an endorsement of any 
linguistic theories; the analogy is provided as a vehicle for putting in perspective the 
mechanisms used by MAGE. It may be useful to consider MAGE an implementation 
of some aspects of these models. 

Although an ATN grammar comprises a large portion of MAGE's 'universal 
grammar', the author does not believe that the augmented transition network 
formalism is in any way related to the internal organization of the child's grammar. 
Dresher and Hornstein [10] describe the claims of some linguists that experimental 
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evldence supports the vtew that ttii ATN modeHs a psyic^totogieatty resrfistlc mode* of 
certain aspects of human Imgi^ic coitHjrehwi^onsrSeeBhefvMornstein, and many 
other linguists disagree. TWsrepwt is not relaled^tottie^ ^balsft. 

Throughout the rest of this chiipter/the Jeon ihecMd' fefofis to the hufl^n 
language acquisltiori mechanisrm posteilated in ttte fjrOpbaed lingi|jstlc models 
discussed above. The author does ncrt olirim tl^lhegcamfi^i^pal ^cftKision process 
implemented as MAGE is In any \*^ reteteel to^ /^^fill^ktofrt, or the tinko^wn 
process^ through Which tf^ team languaga 

4.2 The 'Universal Grammar; of MAGE 

Several aspects of these models are 'imptemented' as components of the 
grammatical extension machine: MAGE forms hypotheses that attempt to account 
for the data it receives. The hypotheses are derived from the program 'universal 
grammar', which consists of knowledge of the domain and the properties of tfie 
grammar it is extending. The kinds of hypotfieses that MAGE can formulate are 
constrained by the set of general extension models, or 'universal rules', presented in 
Section 3.3. MAQE tests each hypothesis by determining whether it is adequate to 
'understand' the example that motivated it. If a hypothesis is inadequate, another 
hypothesis is fomiulated and tested until the program has found an extension that 
enables it to parse the exampte.^^ 

The domain knowledge of MAGE is very similar to the model of a universal 
grammar presented above. Although the program might be presented with samples 
from any of a varie ty of 'dialects' of chatter (e.g. ham radio, military, diptomatic, 
^ ^This process is described m detaui in Sections 3.3 and Z.A. 
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shipping), the extended grammar will contoim to tli© iwweisete of the radio domain 
and of ttie augmented ^rgnnsition networlt re^jres^itertfon 1<^ th# CM'Amniar. 

The radio-domain unjvereals lndyd» ll» isliiHSteHiai :i»jf»^^nte such aa 
networl< protoc^rt, whieh limits the typea olsAiiiMjs Jtottt caw be 'said' (^WWQ 
conversations, and resute in the rigid topical bfei*d^«n of the ATM into the 
CONTACT, TRAFFtC, I^Q4h»FO amt »l3i^€l*6f 8ubnete9«)ri<s and the 
hierarchical organization of these mitwielworte^ into tep^^ subdivisi€Mis. Theses uJes 
ere analogous to the fonrial unlversais ^esoJlbecl lusaetion one o* tfiis chapter, 
because they not only constrain^ btrtirfso def in&^?tlie chw^acter ©f #i© ^ammar. 

The Morse code domain ateo specifies tte4^c*»iw0fefiH»^l«^t^^ 
existenceof 'noise' wortte, and the totemationai*y«Wi09rtc|-»^ns. These fules are 
analogous to the substantive univeii8al»rtri«s,;«ibi©feta<^«te assei^ns that a^^ 

13 

components and semsBitte ^esienJs mustto0idiwi«n from<)(iescrt)eAci3^^ 

Ttw buift-in ATfvl also constitutes a^et of *fwmal u»iversais', which constrain 
the character of rules that can a^jpeaa^ ift#-8Biimarei'8inc#itrigldiy ite^iesibe tm&ot 
grammar the program was denned to extertd. The^TlslBaodel preserteeathe t^fpes 
Of things that can be stored In registers* what tests snd^aetionscan do with reg^ers, 
and the push and pop (call and return)! tTieehaiiiSB»aii#ief»b©|W©d orijacMzation of 
§ubne*worf<s*ito a «r2Bisfttc»rn^iM0rti f ffiMMWr; 



'"h'he domain aspects listed here are discussed in depth in Sections 2.3 and 2.4. 
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4.3 Hypothesis Formalton and Evaluation 

For each exanpte trananfiisslcm, K^GE foraiuia^ a set of hypotheses for 
extending the syntactic/semantle structure of Ihe ATN, phis a set of hypotheses for 
adding function specifice^ons to extract the flfieai*infi^l content of t*ie«<areple. The 
mechanisms used here ar« similar to Mm Hr^juistic modete described in the first 
section of this chapter. According to those eRode^^ttie r^es fir«»irtated by the child 
must meet the universal conditions imposed or tf^^wracter (rf gra»pmatioal rules; 
iiJtewise, MAGE is limited to the forms presided by the set of nuxlels illustrated in 
Section 3.3. Neith^ the 'model d^W nor Ml^^ Is aven capable of considerifig 
grammatical hypotheses that do not meet their constednts* 

The proposed Knguli^ models pr«(JHi^ ttal tto ehttd will ignore sentences 
whose structure and/or vocabulary are too unusual, too diffW5»it from whsitshe 
already knows; MAGE returns a verdict of 'unlean«*rfe' every time it receives a 
difficult example, until it has acquired enough vocabi^ary aottdOlitextual structure to 
simplify the learning of this example to «^ ra^teWng of one new phrase to tts 
hypothesis-formation models. Botti MAGE and tha^moctel^iild' team by a bootstrap 
process. As MAGE is expoeed to more and more exswnple transmissloos, the 
conversatiOTTS it can parse becon^ more comptex. 

The core grammar of the gramma^^*«i^«®i«ii»«H^ine to similar to the pNot 
grammar discussed by Date [9], in that most chatter phrases revolve around one 
'pivot' word, often a q-sign, that determines the meaning of the other words. The 
ability to associate pivot words with only one or two potential subnetworks, coupled 
with the ease with which most pivot words are recognized (e.g. all q-slgns begin with 
the letter 'Q'), is probably the most important feature of MAGE's evaluation measure. 
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Without this ability, the selection mechanism would probably have to rummage 
through each of the thirteen subnetworks, possibly during several passes, to find the 
'best fit' for each example. 

The model of a hypothesis-selection mechanism proposed by Chomsky [4, 5] 
and discussed further by Dale [9], which would accept only those hypotheses that 
make it possible for the 'model child' to make sense of an utterance, according to 
her perception of 'sense', is analogous to the evaluation measure utilized by MAGE: 
a hypothesis is accepted only if it provides a parser with the ability to understand the 
example transmission. Extensions to the grammar are made in such a way that 
learning one new sentence actually results In the power to understand arbitrarily 
many new sentences, since many paths through the ATN may follow the new 
transitions. Thus the resulting grammar is descriptively adequate; theorists claim that 
a human grammar developed according to their models would also be descriptively 

adequate. 

MAGE does not use any of the particular universal rules postulated by linguists 
attempting to explain the very complex processes of language acquisition by 
children, nor does it copy the specific tenets of any of the theorized universal 
grammars (no one knows exactly what the universal grammar used by children 
actually consists of, or even whether it really exists). What MAGE does do is 
implement the concept of a universal grammar, with universal rules that severely 
constrain the development of a grammar that accepts the particular dialect of 
chatter being learned. MAGE also implements the idea of selecting only those 
hypotheses that provide an accurate mechanism for 'understanding' -- or in this case 
extracting the important Information from - the motivating example(s). 
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5. MAGE: A LearniTig System 

5.1 A Model for Learning Systems 

The organization erf the computer prpgranfiyvas^ strongly influenced by the 
research descrtoed by Smith et al. [20]. gnd MAGE ccwitonr© ck?sely to their model 
(rf a learning system. The model details the functional components felt to be 
essential for any learning system, independent of the techniques used for Its 
corrstruction and the specific environment In which It operates. 

Smith et al. defaie a teaming ag^stem as "S^ysys^ which uses Inforrnatlon 
oWalned during one interaction with its environment to Improve its performance 
during future interactions". The perf Ofmaoce of I^GE cs»nplies with this definition, 
as any examples that are added to the gramnrtar's|unden|tanding capability are also 
used by the bootstrap process to extend the grammar to accept future examples. 

The learning system model proposed by Smith consists of six elements. The 
Ir^tance Sele(^or selects suitable trainii^ instances from the environrpent. The 
Performance Element generates an output in response to a training instance. The 
Critic analyzes the output of the performance element in terms of some standard of 
performance. The Learning Element makes specific changes to the system in 
re^onse to the analysfe of the critic. The ^ec^^^rd contains syetem information, 
e.g. ttie emerging knowledge base, thet Is us^ ^M f unqtlonal components. Finally, 
the World Model cor^ios the general si^esumptkHW snd methods that constrain 
system activity. 

My experience with MAGE conformed to this rriodel in one additional way: as 
designer, I viewed the entire learning ^en) as a program whose performance 
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needs Improvement, and I selected instances, &mcm6pmiornmKei and made 
changes accordingly. \n other words, tiw deejgnep's ac^^ities can be modeled by a 
^tem wb<»e componerrts are identical to tbose described dK>ye. This leads to tt)# 
Interesting concept of layered learning systems, each higher layer able to change 
the world model (vocabulary, assumptions, etc.) of the next lower layer on the basis 
of criticizing its performance on a chosen set of instances. 

5.2 MAGE Components 

5.2.1 Instance Selector and Blackboard 

The Instance Selector performs the trivial operation of accepting whatever 
example the user provides and transforming It to the proper data structure for system 
manipulation. It may request the user to answer certain questions about the current 
example. For example, if the current example were "VVV ROCK DE SALT QSA ? K" 
("Rock, this is Salt. What is the strength of my signals? Over"), the Instance 
Selector would look up each word in the vocabulary list of the World Model and find 
that "VVV" is an unknown word. It would print: 'VVV IS AN UNKNOWN WORD. 
DOES IT HAVE A SYNONYM ON THE FOLLOWING LIST? (followed by the list). The 
operation of MAGE on this example is described in Appendix II. The Instance 
Selector provides half of the user-program interface. 

The other half of the user-program interface is the Blackboard, which prints 

statements about each extension the program makes to the grammar, e.g. 
[Adding new transition 'VVV' from state dtofiO of li>-bPi 

(the result of the above example). In addition, all communication between modules Is 

considered part of the Blackboard. Most communication takes place via standard 
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passing of arguments, and use of the same v&riat^ wh&n parts of one module are 
embedded inside another. There are also some i^obai varWbles ttiat designate vAxeA 
portions of the grammar have been altered (kirtr^ ttitel^tfrttfig session and oth^ 
dynamic informatkHi. 

5.2.2 World Model 

The World Model contains the universal grammar, ^^ which includes all 
knowledge MAGE has about the Morse code radio network domain. The core 
grammar is considered a component of the World Model. It contains the 
subnetworks diagrammed in Appendix III, but not any alterations that have been 
made during the current learning session: th^e belong to the Performance 
Element. The World Model has some concise, hand-gathered collectior^ of 
informational items that are distributed throughout the core grammar and wouW be 
difficult to find without these indices, e.g. the set of all subnetworks and symt)Ols that 
can immediately follow any terminal state in the QUAL-CNCT subnetwork. 

The World Model also includes a set of specifications for the tests and actions. 
A 'specification' describes in what circumstances the test or action shoutel be 
associated with a transition and what arguments shoukl be passed to the pre-coded 
function that implements the test or action. 

The spellings of sixty q-signs are known a priori by the system. Each q-sign is 

associated with one or more topical subnetworks »id a possible argument syntax. 

However, only five of the sixty q-signs appear on transitkHis in tf>e core grammar, 

and MAGE must rec eive at least one example for each of the other q signs in order 

^^Section4.2 
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to understand convereations' containing thsrtq-sijp. A sy^ni^r^t^^, which 
includes all q -signs and all other vocabulary contained In the core grarrwaar, is 
maintained. 

5.2.3 P«rformance©era«iit 

The Performance Element consists of two compoR«rts; « TN parser; fi^ttie 
currentversion of the ATHigi*«niar. The fN^lsarSaiis iMsed J6»n thf ATNl P^^r of 
GATNIP {16}, but It does wot 8a\« nor use airy cdiitei^ifidiinlormatlan*iSince it If onJy 
trying to accept a sentence or c©nvefsi*©» calhir toa«?;toylng te>€omprehend % It 
doesn't need tests to deterwine v^^ch woed^shB^Abat^J^ipted by^^atr^nsi^oajwlth 
the symbol •'any" because all <KKie-gioup8atti#€ni|Ki8hi«i<^^ 
the example. There is no mason thrtMAaEi^ecte to B8Cognfc:fii code-groMPS and 
English words && such, since this task iS8i«x»S8fel^p«f*ormBd by COMOEG [21]« 

AHlTOugh there is only asini^ctetasfeimtwneloiptofTientNl the ATN grammar, 
the core grammar is said to be an element of the Wor« Moctel, and the cuwent 
version of the grammar (i.e. the core grammarplssiiartoiis ©i^ensions^dependlng on 
the history of the ewrent teaming ^ssio^j'te' coroWered a c^mgKment of the 
Performance Elemwt. The coreei* ^ite of tieaoiprq-sipi ivocabMlary is also part (rf 
this element, while #»e o^gtaal vocMit^u^iSfMif^l^w WoskfeModel. This ieQn*<WT»s 
to Smith's model Of a learning process as opwp«Nl«0'*J9r*tna»#>t e^iange^ in the 
Perf^mam5^€temer^, wh^^e^ohfy the^iei^tier leaRjA^ 

VW»en «ie^Perform«K» Sement i8;peeeet8ifW>«»«)«^iple, it*e«che^P^^ 
more states where none of the transitions leaving those states matphes the. next 
word in the example (unless the example is already accepted by t^gratfrmiar). When 
this occurs. It passes a set of pointers to these states and a pointer to the next word 
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in the example to the portion of th© Critic that is embedded in the Pertormance 
Element. 

5.2.4 Critic and Learning Element 

The Critic performs three semi-independe^ ? iBnctfons: Eiiialualor, 
DlagrK3®tk>t£»i, and Thaaj^st. 

As Evafuator, tt ^^tkisrtes the Perf(M?iianoei£iem6iita abl% to p^se aach 
sample and tete' «« plirser te h«tt whw ^le (^ienw^aetlh^the pareer cannot 
understand the next phrase of ttie exim^tei Tlie Ewarfuator Is emi^edded in the 
Perfonnance Element. Aa described abewe, when the pieffiBeri hate * provides Vne 
Critic with the necessary stale teiNirma^» to pe^cvmlksliypQthaais-foiKationlask^ 

As Diagno8Uc^n«,^ie Oitic iocgtffi^attw reasons tor poor performance by 
noting at which 8tate(s) the patser was f orcfid to Ni^ It enw^er^es a se| of 
h^otheses based on the striK^itfai matehb^wapritlieaxaeipleandrttie localized 
positton in the grarwBar.^^ 

In Therapist mode, theCritieperforrastlw ©valuation measura.^^ it selects one 
of the hypotheses fomuii€tedwhlteoiftOi£snoa«i^a» mode, and returns t© 
Oiagnostici^ mod©. The Diagnosteiai aiMwwfalB© a 8^ of te$fe<aoM«» 
specifications, and m© Therapist select some ol those i^augmenittie tansittons in 
thene«^Choaai8tNictiM^«&^5r«d@n. ^r 

The Q^ific passes: the ehoaso €lr»QltfB8i:ttiid la^aettoR t^ff^*^^ 
LeBming Element, ¥»hii* ««ttz» kwwirtedgB of imptemenlaliop d«t«^ to^^etermlne 

'^SectionS.a 
^®Sectlon83.4and4.3 
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how to alter the ATN data structure to Include the current ex^t^ipn. Actually* the 
term 'Learning Element' may be a poor choice for this module since it simply makes 
the changes suggested by the Critic; howeveflSilfftv'lt^^/p^ describe* the 
'learning process' in as simply an addition Of irB^yformolatetf and sefecfed rules 
to permanent memory. 

5.3 Implementation Details 

The MAGE subsystem is implemented in MPL ('Muddle') [15] and runs on a 
Digital Equipment Corporation KA-1p under the ITS operating system. MAGE 
includes about 1300 lines of MDL code, and the compiled version requires about 47 
blocks of memory beyond the MDL Interpreter. (A block contains 1024 36-blt words.) 
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6. Conclusions 

6.1 Capabilities anci Limitations 

This repctfl descril?es the development of a compuj^r program, MAGE, that 
acquires and organizes much of the domain-specific l<nowledge required by the 
related system, CATNIP [16], to process conversations over Morse code radio 
networks. MAGE incorporates several of the levels of learning ability described by 
Winston [23]. On the lowest level, it 'learns' the domain -specific knowledge 
contained in its core grammar by being programrhed. On higher levels, It receives 
additional information by being told, in the language of the domain rather than a 
programming language, and it acquires the rest of Its domliin-speclfic knowledge via 
learning by example. It is not able to learn by discovery. 

MAGE uses the parser's ATN knowledge base as a 'core' on which it builds the 
developing grammar. The core contains a certain amount of domain knowledge that 
was readily available to the human who devetoped CATNIP and MAGE but couW not 
be acquired by the present version of MAGE. The inclusion of a core knowledge 
base represents learning lay being programmed. The core includes: 

• the discourse structure imposed on conversations by radio-network 
protocol 

• the types of information conveyed during Morse code conversations 

• the set of generic tokens and information alx)ut how to narrow down 
what should and shouW not be matched by tliese tokens 

• the spellings and meanings of the Internationally defined q -signs 

• the syntax of a few basic phrases and the meanings of ttie words that 
appear In these phrases 
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• the knowledge that 'nolste' words exist 

• how to format the various types of information for human-readable 
output 

This knowledge is reflected in the core as: 

• the top-down organization of the ATM knowtedfleb^e Into thirteen 
semantically categorize<J subnetworks 

• the internal structure of the core subnetwodcs 

• the registers, tests, and actions 

• a lexicon that associates the q-slgn&a^ptber sft«»rc|s cpntained In the 
core vocabulary with their synonyms, If arjy*^|pcfig ife)vv>wn wQitls 

• the printing functions 

MAGE receives as input individual transmissions, each containing either no 
new information or exactly one new frfirase. in some oases where the example 
contains unknown words, ^4AGE mo^ asktheiaw^adcWitandi if^ormi^n about 
the new words, and the user responds m the kMl^M8|J©<^i*i^^B^ than by 

additional programming: this is J^a3iOgl2^t3gl3gi^f^ 

MAGE derives enough information IttjHi^ sSiadH #^ the 

knowledge base to process the new ishnssti Ift th# c«*ttt6xt ^ the example 
transmission and related contexts* Tf^n»^«rtwi8fentoecora^^a»lf^egrirf pal of 
the grammar, utilized henceforth by CATNIP -- to sefectjie correct transcr^ion of a 
conversation from among the many transcriptions suggested by COMDEC and to 
produce a human-readable Summary of tl^lriformatton conveyed during the 
conversation - and by MAGE - *d aW In tfte bodtstrs^ fjroc^di*© thate^ the 
grammar. This process represents learning tjy exampie . The procedure followed by 
MAGE Is: 



.■,j?r!fBff; 
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1 . MAGE attempts to parse the example transmission 
using the cun-ent version of the ATN. 

a. If the example can already be parsed, get a new 
new example. 

b. Otherwise, the parse failed at some particular 
word in the exarhple Sentence; that Is, ft cacwld^ 
not advance any of the one or mdrepdrseiayhs by 
another transition matching this word. Cfirfl the 
last state in each failed patfi a 
'last-matched-state'. Call the word on which the 
parse failed the 'next-word'. 

2. MAGE looks for some word folldwing the next-word 
that follows the ericj of th6 ni5W phrase. 

a. This word and all words following this word in 

the sample match sQme c<^necto4|ipu^pP9 0^^ 
states ard tremsitions in the Al7^ fhitcsu^^TO^ 
reached, via foOs^ig tr^isttion$i^l^ qpie^or 
more of the last-matched-states. CaN the first 
s^sie in!eaclTSiM!M8e<^imi@e an^id)^(4^iNiKi^. 

b. Or, th^eis no such^word 2irtd^ie»i#w pli^^ 
ends at the end of ttie treuismission: tfie extenslwi 
representing the new pht^rtitfSt^hdlh fii^Hrt^ 
State, alsoc^^'eod-pf-phpaj^e'. , , 

a MAGE compares ^ai#i la«|fnfttch§#Tl^te/|^i^ 
phrase pair to the set of models, where any of tfie 
thrde states^rrespondi^ toUwserio ModetA may 
match the tast-matched-state and any taminal stito 
may malf^tfie end-of^pf^ii^, ds&peiiiHitf icin«i6 
particular model and circumstances. 

a. It finds one or more models fpreach pah" Jfriat 
could be used to construct an extenSon for Vie 

combination a 'template*. 

b. It selects the best template on the baste of a 
set of heuristics and consteucts the structural 
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component of an addition to the ATN, callal an 
'extension'. 

4. MAGE selects a setof test spectfications SK4a 
set of action specifications for each of the 
transitions in the extension. 

a. The specifications are chosen according to a 
set of heuristicsilhat cdns^erthe tiis*^sii»v ^ 
symbol, the context of the rest of the 
transmission, and the particular suBn^Wddc to 
which the extension was made. 

b. MAGE adds the specifications tft^»tPl»*>'^ 
constructed extension and gets a new example. 

MAGE ms^ extend the knowledge b?B9e5tO)lni;fci«les an #Mtr^^ number 

of new phrases for discussing the concept^^^|Q5f^ fey^^^^ discourse 

structure. It augments the transitions that process the words of these phrases with 
tests that provrde filters for generic tokens atnd actlonis that Extract »fid Information 
from a phrase that provides temporary context and contents for the summary output. 

MAGE may be considered an ttfl^lMPBn^i«^d^8«^5>e Wnfl^stlp '^#® °* 
human language acquisitk)t» proposed by §$pmsk this analogy Is very 

natural, sinc^ langua^ gKjquisltion seemS v(if^ t;1osply rela^ grammatical 

extension. 

• The domain-specific knowledge contained In the core knowledge base 
corre8porK*stottieinftate*ur#i^rsalgfaiWH^r'. ' 

• The example transmissions e^orrespond to the i^erartees heard by the 
'child'. ^ 

• The models and associated heuristte* eornespohd to ^ 'universal 
rules'. 

• The creation of several templates ahd q^skter^tlon of possible 
test/action specifications corresponds to the fdrmatibn of competing 
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hypotheses. 

• The construction of one extension that processes the example 
corresponds to the selection of one hypothdsto «iat adequctety e^^Mns 
the data. 

Even if these models turn out to be poor descriptions of the learning processes 
actually used by children acquirir^ their liaiive lar^gyagf, this research has 
demonstrated that these theories are still useful in thfeb^nbf doniputer programs 
that successfully learn by example. 

However, MAGE has many HffrttattcamsJ 

• It is not able to recognize changes to the discourse structure or to the 
type <Jf infoitnatldn com^yetf duriiisa coiTh®i««on«v Shoiric^^^ese o^^ 
In other words, it cannot create new subnetworks, registers, tests, or 
actions, nor discard existing dnes; ' ' ^ 

• It includes no mechanism for automatically adding the tueanlngs of new 
qslgns ©r other y(»cabiJlacy wcpds^ untess th^se words ace synonyms of 
previously known words; however, this can be easily programmed by a 
huff^tfi. 

• It assumes thfe existence of an Intelligent widknowiedQeable user, who 
does not simply type^ in complete new transcripts but rather edits the 
example transmisstoris so that they ea^WcWde^a^ 

This means the user shouW have some knowledge of the current 
capabilities of the knowlodge base. ^oirtunat^'^Adfe performs 
adequately most of the time with a naive user, except where ^e 
transcript includes a large number of 'interruptions'. 

• Most notably, the current version of ^GE can not deal wiUi the 
interruption problem and is able neither to extend the Interrupt 
Subfietwerk and r^ted lower-teirtl sufen^worHs nor filter out 
interruptions from example transmissions. 

These limitations are what separate granimatical or knowledge-base extension 
from grammatical inference. If MAGE could do all these things, it would be able to 
acquire, from transcripts, all the domain-specific knowledge required by CATNIP. 
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That Is, it could learn ^ diac^werv . the highest and leapt unc^topd form of 
learning. 

A system that could, do all the tWng» Hst©d above, wWiogt prior 
domaln'S|»ec#tef knowtedge* could &mfm^(mMf^ wm^^ = *e particular 
domaln*8p^itfie*nowledg© required t^ 2^ sfs^Wi \(i*^iif ifiipwlei^e base cc»*ld be 
derived by a humstfi Irom a r^isoni^rfe anii»«it ofrdfi^iaken dir^tty from the 
domain and orgamied as an augmente(l*«T8W©«i nii^^^e^-llwould bea solMtlon to 
the very difWcult prot^smf of grammaticaH*#M?wil»^^(tmcQnti3|t#^aWve 

6.2 Suggestions for Future Research 

The research described In this report represents a small step In the 
development of a grammatical -inference machine that could construct the 
knowledge base or grammar necessary to parse a natural language from scratch, I.e. 
without requiring a programmer-defined organizatton of subnetworks, registers, 
tests, and actions. The design of this machine would require the removal of all the 
limitations described above, which involves rinding the solution to two major 
artificial -intelligence problems. 

One of the problems to be solved is grammatical inference of the transition 
network component of the ATN from an incomplete set of examples, each containing 
an arbitrary amount of new information and an arbitrary amount of old Information. 
The current state of machine inference of context-free grammars, which are 
equivalent to non-augmented transition networks, assume a sfrucrt/ra/Zy comp/efe 
sample set. However, It is Impossible to put together a sample set using every 
production or rule in a grammar when It has not yet been agreed what all the rules 
are for any natural language: Therefore, either a new Inference algorithm with 
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different assumptions or' a compieteiy different method fm deriving griarwnars lor 
natural languages would have to be developed. 

The s^titlon to the otier problem requires theai^OBKrtiorr of fc>oth the process 
erf recognizing ttie needier cer^nregl^ers, dftd the process of writiiae algorithms, 
or abstract function descrlfrtlons^ for the test&aB^Hac^ora^^QneejB^ a^gorathev hsts 
been gener^fed In some ^«ple 'progranwlil^ toaguage' fcnovim by ttie f eai^lng 
program, a human programmer i%>rtd ©ocfeft»i»t©8l8 arid actfofw in the actual 
languag&{e.g. P*ascsa, PL/t, MDL) sultaWe for th»p8H^culaneewironmenfc 

Both problems might be considerably more tractsrf^le If restricted to Morse 
code or some equally simple domain, and If they could be solved Independently. That 
Is, the ability to utilize the domain-specific knowledge inherent in a programmed 
version of one of the two components may make It easier to develop an automatic 
mechanism to perform the other functk>n. 

For example, a grammatical-Inference machine might use some domain 
knowledge, such as the topic of q-slgns or the type of information conveyed during 
conversations, to develop the set of sul)networks for processing h4orse code 
conversations. The Morse code domain simplifies the test/action problem b)f 
restricting the potential contents of registers to words and phrases selected from 
transmissions. Tests are restricted to putting additional constraints on generte 
tokens by comparing the contents of registers to thie current word(s); actions are 
restricted to selecting/storing Important information and delving Information that Is 
no longer desired. This knowledge might be utilized by a program that automatk^aliy 
generate registers, tests, an6 acticms. 

Regardless of whether these proljlems are ever dealt with for the specific case 
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of automatic generation of the knowledge base for parsing Morse code 
conversations, it is hoped that they will someday be solved for the general case, so 
that machine acquisition of natural language will become possible. 
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I. A Morse Code Conversation 

A typical example of a goal-prlented Morse code conversation is given below, 
with each transmission followed by an English transcription. 'R(XK' and 'SALT are 
two operators. Very little of this conversation can be understood by the parser using 
only MAGE'S core grammar, which is pTe^nted Wt A^i^>ertcllx til, although aft of it can 
be parsed using tfie complete grammar actually used by CATNIP. However, MAGE is 
capable of extending the grammar so that the parser can 'understand' this entire 
conversation. The sample learning ^ssion pr^ented^ln i^pi3^dric II ^bvi/s how 
MAGE extends the core grammar to undera^^iR^ new trarffirrUssions; many of the 
transmissions in this conversation are used as exBBiplea. • 



VVV VVV ROCK ROCK ROCK DE SALT SALT QSA ? K 

("[Hey] Rock, this Is Salt. wrrat»M& afcwi^^fl^f«i^^lsJ 
Over") 

VVV VVV ROCK PE SALT QSA ? QRK ? P^A ? QRK ? QTC QTC K 

("[Hey] Rock, this is Salt. What is the strengtt^ of my ^gjials? 
What is the intelligibility of my signals? [Can you hear me?) 
Ihavemessagesforyou. Ovei^ v^ J ^or m 

SALT DE ROCK QSA 5 QRK 5 GA K 

("Salt, this is Rock. The strength of your signals fe^vftry^KXXi. B»© 
intelligibility of your signals is excellent. [I can hear youl] 
Go ahead. Over") 

HR TFC HR TFC OK ? K 

("Here's some traffic. [I'm going to send a message now.] Okay? 
Over") 

QRVK 
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("I am ready. Over") 
NR 1 GR 200 1 600 BT <1ob code-group8> BT BT <1dO code-group8> BT 
QS1.7K 

("[Message] Number one, with 200 groups, at ISOQ hcMjrs (3 p.m.) break 

<100code-groups>break<100code-gtoup4>br^. 

Can you ad^nowtedgej-jeceipt? Over") 

N N PSE RPT GRPS 25 , 40 , 98 K 

("No. Please repeat groups 26, 40, and 98. Over") 

OK OK GRP 25 <cdde-group> / (jMP 40 <code-group> / 
GRP 9a<cocle^gro4M)>K 

C'Okay. Group 26 is <code-group>. Group 40 Is <code'group>. 
(jxoug^te<t6d&-gTOup>.Ws^ - 

TKS QSL UR MSG NR 1 >««fK 

("Thanks. I am acknowledging receipt erf your message number one now. 
Over") 

QTC?K 

("Do you have aiy mefflsageslorfn^Oiiw^) 

QRUQRX7K 

("I have nothing for you. Whenlfiffl ycKJ call me agaiir? Ov^) 

QRXNXTTMWOK?K 

("I will cali you again tomorrow. Okay? Owr*^) 

CCSKSK 

("Okay- &Ki of conteca") 

VA 

("End of contact") 
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II. A Learning Session with MAGE 

An example of MAGE's performance is given below for each of the seven 

general models presented in Section 3.3. The prose in brackets is that printed by 

MAGE for the given example. In each case, Figure a shovAfflibe model selected by 

the hypothesis-formation algorithm; Figure b displays the original subnetwork 

selected by the evaluation measure; and Figore c gives the result of applying the 

model to the example and the chosen subnetwork. Since it is difficult to show tests 

and actions in the diagrams, the selected test/action specifications are presented in 

the brief discussion below 6a£li«D(ample. 

Example 1 

RQCK DESALT PSE ANS QTC K 

[Changing state 1 of TFC-INFO to TERMIMALj 



© (3— — 



r MODEL 13 



Figure 10a: MoelQirl 



x 
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C CORE TFC - \kfOt 



Figure 10b: Cor«^!P©^#M® 




LEXTENDED TFC-INF03 



Figure 10^ ixtonddd^PC-INFO 
("Rock, this is Salt. Please answer, I have messages for you. Over") 
The phrase "RCX^K DE SALT" Is accepted by the ID-OP subnetwork (Figure 
19), and "PSE ANS" is accepted by the QUAL-CNCT subnetwork <Flg. 21). When a 
phrase accepted by ID-OP is followed by a phrase accepted by QUAL-CNCT, the 
two phrases together are accepted by the CONTACT subnetwork (Fig. 18). This 
subnetwork may be followed by the TRAFFIC subnetwork (Fig. 22), as well as by 
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another occurrence of CONTACT, as shown in the OVERALL subnetwork (Fig. 17), 
the highest level subnetwork in this ATN. "QTC" matches the symlx)! on the first 
transition of TFC-INFO (Fig. 23), wNch is pushed to (calted) by the first transition of 
the higher-level TRAFFIC subnetwork. However, "K" does not match the next 
transition in TFC-INFO; instead, it matches the transition following TFC-INFO in 
TRAFFIC. This indicates that the next-state of the "QTC" transition shouW be a 
terminal state so it can pop (return) to TRAFFIC, so MAGE changes It. 

Since no transitions are added, it is not necessary for MAGE to consider 
adding new tests or actions. 
Ex ampte 2 

QSLMSGNR37K 

[Adding new transition '7' to Stat* 4 of M^KNOW] 

[Also adding 1 new states to ACKNOW] 

[States: TERMINAL] 

© — <D — O— 

C MODEL 2] 

Figure 11a: model 2 



74- 



(I>-5;:-<1>-j;5F<B^-ij?-h5>-5— Q 



[CORE ACKNOWJ 



Figure 11b: core ACKNOW 



[EXTENDED ACKN0W3 

Figure lie: exterxied ACKNOW 
("Can you acl^nowledge receipt of message ruimber three? Over") 
The phrase "QSL MSG NR 3" is accepted by the ACKNOW subnetwori< (Fig. 
28) and "K" matches the symbol on the transRion following a (call) push to 
ACKNOW in the higher-level REG-INFO subnetwork (Fig. 26). Since it is known a 
priori that extensions should be made to lower-level rather than higher-level 
subnetworks whenever possible, MAGE adds a transition "?" to the tenninal state of 
ACKNOW and creates a new terminal «tate that pdps (returns) to REQ-INFO. 

Now the action [SCRATCH input] (store Input token In <scratch-pad> register, 
destroying the previous contents) is already associated with "QSL". Since "?" refers 
back to the q-sign, the action [Q-PEND SCRATCH] (the token In <scratch-pad> was 
used as a question; put it in the <pending-question> register of the receiving 
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operator) is associated witli the new transition.^ ^ 
Example 3 

NR 1 GR 300 QTR 1 400 any BT QSL ? K 

[Adding new transition 'QTR' to state 4 of HEADER] 

[Also adding 1 new states to HEADER] 

[States: # . to TERMINAL] 




t MODEL 31 

Figure 1:2a: Model 3 



(2h:;KI>7-<S>»Ki)-rKi)-^-0 



[CORE HEADER] 

Figure 12b: Cor^^APER 



^ ^ AH tests and actionji are deRhed in AppencHx IH. 
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(Ih^p<I>-^-(Ih^r(^ 




CEXTENOED HEADER 3 



Figur# 1 2c: Extended fCADER 

("{Now sending message] nuinber one, with 200 groups, at the time 1400 
hours. Break <co(ie-groups> break. Can yoa adoiowledge receipt? Over") 

"NR 1 GR 200" matches the first few transitions erf the HEADER subnetwork 
(Fig. 24) and is foltowed by a tranettion matching "1400" (i.e. the symbol on this 
transition is " # "). "any BT QSL ? K" te acc^Jted by the MESSAG subnetwork (Fig. 
25), which foWows HEADER in the higher-level TRAFFIC subnetwortt (Fig. 22). Thus 
"QTR #" appears to be an alta-nate way of phrasing this last "#", so MAGE 
creates two new trar^tions "QTR" and "#", wtth a new sftsrte between them, in 
parallel with the original transition for "# ". 

Since "QTR" is a q-sign foltowed by an argument, the actfon [SCRATCH input] 
is associated with "QTR" and the actions [Q-VAL input] and [Q-ACT SCRATCH] are 
associated with the argurwent. [StRATCK Ir^ws^ stores the input token in the 
<scratch-pad> register, destroyfcig the prevtous contents; [Q-VAL input] adds ttie 
next input token to the <scratch-pad> wm^ witt^out destroyUig the prevkx^ 
contents; [Q-ACT SCRATCH] removes the q-sign and its argument(s) from 
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<scratch-pad>, determines which register to put them in, and puts them there. The 
possible registers include <eKpectecl-actlqns>, <quaNty-of-contact>, 
<general-situation'description>. In addition, since "QTR #" is another way of 
phrasing the " # ", any tests or actions on the original transition must be copied to 
the new ones: therefore, [GMT-TIME in|xit] is a^ associated with the new transition 
for " # ". The action [GMT-TIME input] puts the input token, indicating time of 
transmission, in the <time-and-date> register. 
Example 4 

VW ROCK DE SALT QSA ? K 

'WV' IS AN UNKIIOim WORD. DOES IT HAVE A 
SYNONYM ON THE FOLLOWING LIST? 

<list of known vocabulary words that are notq-signs orcaH-signs> 

N 

DOES 'VW' HAVE A QSI6N SYN0NYN7 

N 

COULD 'VW' BE CONSIDERED A 'NOISE* 
WORD? 



[Adding n«w transition 'VW' fron stat* 
to of ID-OP] 



GD <D O 



[ MODEL 4] 



Figure 13a: Model 4 
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(i)^7jj:j;:0-^sf-(^^ 



CCORE lO-OPJ 

Figure 13b: Core ID-OP 

LEXTENDED ID - OP] 

Figure 13c: Extended ID-OP 
("[Hey] Rock, this is Salt. What is the strenglh of m siflnals? Over") 
Since "VVV" is a new word, MAGE asks the user to supply some information 
about its meaning. Since MAGE istokitt^t "VVV" i^ a !|wi8a',4|ford, and it fe followed 
by "ROCK DE SALT" which is accepted by the ID-OP subnetwork (Fig. 19), MAGE 
adds a new transition "VW" as a loop on the start-state of ID-OP. 

There are no tests or actions associated with noise words. 
Example 5 

NR 2 GR 150 16(X) any BT any BT QSL ? K 
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[Adding new transHlon Vany' to sUU t of MESSAO} 
[Tho arc has noxt-atatt 1] 



©=o — o 



C MODEL 5] 



Figure 14a: Model 5 



ony 



07;7-*O-iHl>w<5N-^^ 



[CORE MESSAG] 



Figure 14b: Cin^e MESS AG 



any 




(I)-^^^r\^>=XJhsr^^ 



CEXTENDED MESSAG] 



Figure 14c: Extended MESSAG 
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("[Now sending message] number two, with t50 groups, at 1600 hours. Break 
<code-groups> break <code-groups> break. Can you acknowledge recfflpt? Over") 

"NR 2 GR 150 1600" is accepted by the HEADER subnetwork (Fig. 24), whtoh 
is followed by the MESSAG subnetwork (Fig. ^) In the higher-level TRAFFIC 
subnetwork (Fig* 22). "any BT" is matched by Ihe first Iwo transitions of the 
MESSAG subnetwork, but the second "any" does not match any transitions leaving 
state 2. Rather than branch to a new path that merges with the oW at "QSL", MAGE 
notes that the second "any BT" also watches theilrst two transitions of MESSAG. 
MAGE creates a new transition that returns to state 1, so this new phrase can be 
repeated indefinitely. 

The tests and actions that are associated with the or^kial "any" transition 

from the start-state to state 1 are copied to the new "any" traRsi^on: test [GROUP? 

inputfand action [AOD-GRCXIP input]. [GROUP? Input] returns Tf^UE If the input Is 

probably a code-group or English word; [ADD-GROUP Input] Increments the 

<number-of-words-receivBd-so-faMn-messago> register, and puts the input token in 

the <last-word-received-in-messs^s>.mQ^AeCr»i^y^ ^^ Mseful for error-recovery. 

Example 6 

QRX?K 

[Adding n«« transition 'QRX' to stato of END-CKGit] 

fAlso a4d4fltr 1 now atatoa^ EUD^GIiCrl 

[Statas: 7 . to TERMINAL] 
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[MODEL 6] 



Figure 15a: Model 6 




'SK' . 'K' 



<J 



C CORE END - CNCTl 



Figure 15b: Core END-CNCT 
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LEXTENOED EN3-CNCT] 



Figure 1 5c: Extended END-CNCT 

("When will you call me again? Over") 

Here is a situation where the first word of the example doesn't match aoy 
transition leaving a start-state. However, the q-sign "QRX" is semantically 
associated with the END-CNCT stttMl^tti«>fl« <Ffe.' 29). Since "K" appears on a 
transition to a terminal state in END-CNCT, and the END-CNCT sul>network can 
follow itself in the highest-level OVERALL subnetwork (Fig. 17), the new phrase 
"QRX ?" is added t&END-CNCT as a newpm. —■ 

Since "QRX" Is a q-sign followed by a likely argument, it is associated with the 
action [SCRATCH input], whteh saves the q-sign in the <scratch-pad> register until 
its argumenl(s) are collected. The argument "?" Is associated with the action 
[Q-PEND SCRATCH], which notes that the q-sign found In <scratch-pad> was used 
as a question and stores it in the <pending-question> register of the receiving 
operator. 
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Example 7 

QTC?K 

[Adding new transition '7' to state 1 of TFC-INFO] 

[Also adding 1 new states to TFC-INFO] 

[States: TERMINAL] 




[MODEL 7] 




Figure 16a: Model 7 
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tEXTENDED TFC-JNFOJ 



Figure 16b: Recently extended TFC-INFO from Figure 10c 




[EXTENDED TFC-INF03 



Figure 16c: TFC-INFO extended further 
("Do you have any messages for me? Over") 
In this case, another extension is made to a previously extended subnetwork 
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(see Figure 10 above). "QTC" matches the fir^ Iranation in the TFe^lNFO 
subnetwork (Fig. 23), but "?" does not m^ch the transition leaving this state, nor 
cJoes it match any transition leaving the state In the TRAFFIC sul)network (Fig. ?2) 
that can be popped (returned) to from this terminal state. Since "?" is likely to be a 
q-sign argument, a branch is cheated in TFC-INFptha|tec>cls in a new terminal state. 
(Actually, this terminal state Is merged with the oAher terminal state that has no 
transitions leaving it in order to minimize complexity.) 

Since "?" refers to a q-sign, and [SCRATCH input] is already associated with 
that q-sign (and will store the token in ihp <scratch-p^p> register), the action 
[Q-PEND SCRATCH] is selected for the new transition (to retrieve the qslgp from the 
<scratch-pad> register and put It in the <pend/nsi-guesf/pn> register of the receiving 
operator). 
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HI. The Core Grammar of MAGE 

This appendix includes a Hst dt the chatter words t»at appear in ^ie a>re 
grammar, Illustrations of the sut»ietworks 6ompo^ng the core ^"ammar, a ffet tff 
registers, arid descriptioiis of the tests and actfdns. AWiou^h ttie registers, te^ 
and actions are the same as used by CATNff»^fie],^ttie vocabu*ary and gmmnm of 
MAGE are considerably smaHer than the gramrwaf used toy GATfW*. 

Vocabufary 

?" question mark; punctuation arKl a q-i^gnaffturnwit 

#-- generic matched by any mifrtber 

ANS--"ans¥ver" 

any - match^ any code-group or Engl^ word irim^sdoe 

BT - - "break " ; a pro-sign 

cailsign -- generic matched by any (known) caii-sign; MAGE cannot recognize 
call-signs without being toM 

DE - "this is" or "from" 

detim - generte matching any delimiter: break or punctuation 

GR - "There will be code-groups or English words in next message" 

GRPS - "groups" 

K - "end of transmission"; a pro-s^n 

location - generic matcfied by any (known) kx:atk»i 

MSG - "message" 

new-spe£ri<er - denotes speaker change 

^®Section2.2 
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NR -- "number" 

PSE -- "please" 

QRX-- "I will caU you again at - - - hours" or, if fallowed "?", "When will you 
call me again?"; although MAGE knows the spelling and tppical associations of sixty 
q-signs, the q-signs listed here are the only ones that MAGE knt^ft^ how to use in 
context (because they appear as tran^tion syR^>ls In thejcme grfflrowar) 

QRZ - "You are being called by - - - (on freqttency )", or "Who Is 

calling me?"; parentheses indicate an optional argument 

QSA - "The strength of your signals Is ", or "What is the strength of my 

signals?" 

QSL - "I am acknowledging receipt (of i", cy "Can you acknowledge 

receipt (of )?" 

QTC - "1 have messages for you", or "How many messages have you 

to send?" 

RPT -- "repeat" 

SK - "end of contact"; apro-sign 

ZOH - "There will be - - - axle-groups in the next message" 

S«ibnet works 

Legend: 

• States are represented by circles and transitions by arrows. 

• A circle containing an S represents the subnetwork's start-state. Any 
circle with a darkened area r^resents a terminal ^ate. 

• Each transition has one or more transition symbols. If a transition has 
more than one symbol, they are separated by commit. 

• A word composed of upper-case letters surrounded by (single) 
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quotation marks indicates that this transition accepts the particular 
chatter word. 

• A word composed of upper-case letters, but not sun-ounded by 
quotation marte, denotes a iiHish (call) to me rtawwrfsiAjnetwork. 

• "(new-speaker)" denotes a speaker change, or switch of receiving and 
sending operators 

• Other wordscompt^ed of toi«er-cas(^i©^is^anct'l#"i denote generte 
tokens that are replaced by speclfte chatter words at parse-time (e.g., 
"catlsign" maybereiirtacedby "ROCX^ an bpeiator'scirtl'^n). 



CONTACT 

(new - speaker] 



( new - speaker) 



R£Q-II^O 
(new -spvaherV 



(D 



CONTACT 




[CORE OVERALL] 



^sim-CNcr 

( new - speaker) 



Figure 17: OVERALL, subnetwork 
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[CORE 


CONTACT] 



Figure 18: CONTACT subnetwork 



Vl/ callsfgiK^ 'DE' ' \U call sign ' vV 



[CORE ID-OPl 



Figure 19: IO-(^8U^etwork 
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\^ '0/?Z' 'vly '^' V_y callsign^\^ 



[ CORE NET -RELAY] 



Figure 20: NET-RELAY subnetwork 




[CORE QUAL - CNCT] 



Figure 21 : QU AL-CNCT subnetwork 
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TFC- INFO { new - speoker ) 







Tf'C-INFO 




'BT'. 'K' 



TFC -INFO 




f 2 



HEADER 






CCORE TRAFFIC] 



Rgure 22: TRAFFIC subnetwork 




[CORE TFC - info: 



Figure 23: TFC-INFO subnetwork 



G>-.;K!>^<!HHi>^-0-^K3 



CCORE HEADER! 



Figure 24: HEADER subnetwork 
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any 



(^y7;Airir<^hic<^y^^ 




[CORE MES 



ik'Sl 



Figure 25: M^8A@ si^xielwMirfc 




[ COB£- BEQ - tNFOI 



Figure 26: REQ-tNFO subnetwork 
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dtHm 



:0 



d«lim 



any 




[CORE RE0-R*'T1 



Figure 27: REQ-RPT sulxietwork 



(iHsrKlHj^Ki)-^^;^^^^ 



[CORE ACKNOW] 



Rgure 7B\ ACKIIOW srfanetwork 
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& ^TPTV^ — 'Q 

[CORE END -CNCTl 

Figure 29: END-CNCT subnetwork 
Registers 

<information-about-receiving-operator> - Call-slgtH' locatton of station, and 
other information regarding ©urrent^ecelver. 

<information-about-s0nding-operitor> 

</asf-wofd-fece/vecf> - UseMfor^Tor-refpjvery. 

<time-and-dat0> 

<scratch-pad> ■■ Temporary storage for saving arg^umente, etc. 

<number-of-words-in-m0ssage> 

<id-number-of'fnes$age> ■■ UsuaHyi^umbered in order of sending. 

<number-of-words-received-so-tarrin-messag0> - Useful for comparing with 
contents of <number-of-words-in-message> register to determine wfietfier entire 
m^sage fias t)een recei>^. 

<last-word-receive^Nnfm98$a^9>:'^M90V^ ^ frnsffecovery. 

<general-situation-description> •• Description of radio-network status. 

<quality-of-contact> -■ Description of station status. Tfiere is one of these 
registers for fiacb active operator. . 

<expected-action> - Actions that an operator is expected to perform, usually 
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in r^poni^ to request; this provide a context for iwipredlctable actions. There is 
one of these registers for eacli operator. 

<pending-questions> - Questions an operator is expected to answer; this 
provides a context for unpredictable phrases that m^ht be emymrs^io que^or^. 
There is one of these registers for each active operat<y. 

<requests-for-repeats> - Reque^ for something (usually a code-group) to be 
repeated. 

Tests 

[GROUP? input] -■ Returns TRUE if the argument is not a q-sign or delimiter; 
Used only when transition symkK>l Is " any" . 

[NOT? <list>] - Returns TRUE If the if^aut word Is not a member of <list>; used 
when transition symbol is "any". The argument 'input' does not ai^^ear explicitly In 
this test specification because test and action ^jesctfications are constrafcied to 
include only one argument; however, the actual fWKJtions that implement these tests 
and actions also have access to the set of context registers end the current input 
token. 

[-RECEIVER? input] - Returns TRUE if token is Dflt (due to '-') the same as 
the call-sign in the <information'abdut-receMng'Operator> regtet^; used only when 
transition symbol is "callsign". 

Actions 

[RECEIVER Input] and [SENDER input] ■■ Put Input token in call-Sign fleW Of 
<information-about-receiving-operator> or <information-about-sending-operator> 
register, respectively; synrbol is "callsign". 

[NSPEAK T] - Switch contents of <information'about-receivmg-operator> sand 
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<information-about'Sdnding'Operator> regtsl»^, If noii«einf^; symbol te 
"new-speaker", denoting speaker change. 

[SCRATCH IfipuQ > Put *iput token In n\e<8cratch'pad> register, destroying 
previous contei^; S!^boi aut>itrsry. 

[Q-VAL input] - Add input token to list ol tokens la <scrafc/?-pad> register 
without destroying prevfous contents: tfie fic^ etein^t oH li^'is the pivot word, others 
are Its arguments; symbol arbitrary. 

[Q-ACT SCRATCH] - Get pivot word (usually q-sign) and arguments from 
<scratch-pad> register and put In one of 4he ^quamy-iH'ContacO, 
<expected-actions>, or <general-situation-d!ss&tifition> registers, depending on 
meaning of pivot word and its argument's); symbol artiitfary but always preceded 
directly or indirectly l}y a pivot word. 

{Q-ACT input] - The pafftieuiar pivot wofd is i^t likely to have arguments, so 
proceed to put it in one of the alx>ve registers; syn#E)l*»^K% a q^sign. 

[Q-PEND SCRATCH] - Get pivot word from the^«cf alc/?-pad> register and put 
in the <pending-question> or <expected-action> register, depending on the meaning 
of pivot word; symt)ol fe "?". 

fMSG-^«JM input] -- Put token in <id-num^er-of-messs^> register; this is the 
identification number of the next message; symbol Is "# ". 

[TFC-GR-NUM input] - Put token in.^oumber-of-words-/n-message> register; 
this is the number of cocte-groups or English mmd» to bo sent in the next message; 
symbol"*". 

[GMT-TIME input] - Put token In time field of <t/me-anc/-dafe> register; this Is 

time of transmission of most recent message; S)^isl)ol " # ". 
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[ADD-GROUP input] •■ Put token In the <last-word-received-in-message> 
register, useful for error- recovery, and increment the 

<number-of-words-received-so-far-in-message> register; symbol "any". 

[LAST-GROUP T] Compare contents of the 

<number-of-wordS'received-so-far-in-message> with contents of 

<number-of-words-in-message> register; if former < latter, tell COMDEC to turn off its 
code-group recognition mechanism; symbol is "BT" or some other break. 
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example sentence. A set of models act as templates to produce possible extensions 
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the grammar. This extension is henceforth an Integral component of the knowledge 
base and may be used by the parser to process conversations and by the learning 
program to exterid tfie grammar further. 

This report relates the mechanisms used by the learning program to 
grammatical inference of context-sensitive languages, which include ^:ttie natural 
languages, and some proposed linguistic models of human language acquisition. 
These models describe language acquisition as a process of developing hypotfieses 
according to the constraints of innate universal rules, and acceptance of tfiose 
hypotheses that make it possible for the child to understand new sentences. 
Similarly, the learning program develops its hypotheses within tfie constraints of 
certain 'universal' models and accepts only those hypotheses that enable the parser 
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